Open Access

This article is freely available and re-usable.

*Appl. Sci.* **2018**, *8*(11), 2317; doi:10.3390/app8112317

Article

Impulse Noise Denoising Using Total Variation with Overlapping Group Sparsity and Lp-Pseudo-Norm Shrinkage

School of Physics and Information Engineering, Minnan Normal University, Zhangzhou 363000, China

^{*} Author to whom correspondence should be addressed.

Received: 16 October 2018 / Accepted: 5 November 2018 / Published: 20 November 2018


## Featured Application

**This paper proposes a new model for restoring polluted images under impulse noise, which makes a contribution to the research in image processing and image reconstruction.**

## Abstract

Models based on total variation (TV) regularization are proven to be effective in removing random noise. However, the serious staircase effect also exists in the denoised images. In this study, two-dimensional total variation with overlapping group sparsity (OGS-TV) is applied to images with impulse noise, to suppress the staircase effect of the TV model and enhance the dissimilarity between smooth and edge regions. In the traditional TV model, the L1-norm is always used to describe the statistics characteristic of impulse noise. In this paper, the Lp-pseudo-norm regularization term is employed here to replace the L1-norm. The new model introduces another degree of freedom, which better describes the sparsity of the image and improves the denoising result. Under the accelerated alternating direction method of multipliers (ADMM) framework, Fourier transform technology is introduced to transform the matrix operation from the spatial domain to the frequency domain, which improves the efficiency of the algorithm. Our model concerns the sparsity of the difference domain in the image: the neighborhood difference of each point is fully utilized to augment the difference between the smooth and edge regions. Experimental results show that the peak signal-to-noise ratio, the structural similarity, the visual effect, and the computational efficiency of this new model are improved compared with state-of-the-art denoising methods.

**Keywords:** overlapping group sparsity; Lp-pseudo-norm; accelerated alternating direction method of multipliers; impulse noise denoising

## 1. Introduction

Image denoising is one of the most important research areas in the field of image processing, and it has great value in both theoretical studies and engineering applications. Its usage spans the broad fields of image restoration [1], detection [2], photoelectric detection [3], geological exploration [4], remote sensing [5], and medical image analysis [6], among others [7,8]. With the development of compressed sensing theory, image processing algorithms based on sparse representation and constrained regularization have evolved into promising methods of image restoration [9]. Models based on total variation (TV) regularization [10,11,12] are found to be effective in removing random noise. The TV model has been successfully used in image restoration tasks such as denoising [13], deblurring [14], and super-resolution [15]. Although TV regularization can recover sharp edges of a degraded image, it also leads to some undesired effects and transforms smooth signals into piecewise constants, the so-called staircase effect. Several models have been proposed to improve on the TV model [16,17,18,19,20,21]. One usual method is to replace the original TV norm with a high-order TV norm. High-order TV overcomes the staircase effect while preserving the edges in the restored image. However, high-order TV-based methods may over-smooth the image and take more time to compute; more details can be found in Reference [22]. In 2010, Bredies et al. proposed the total generalized variation (TGV) model [19]. The TGV model puts constraints on both the first- and second-order gradients of an image, thus effectively attenuating the staircase effect of the TV model. Still, it is difficult for TGV to preserve image details and suppress noise simultaneously. Furthermore, some scholars have turned to fractional-order gradients in place of integer-order gradients [23]. Their research shows that a fractional differential operator with 0 < v < 1 can appropriately process the noise and edge information, but it also leaves un-denoised “spots” in the image.

Although these improved methods can alleviate the staircase artifacts, they might introduce “spots” effects in the processed image. Choosing a good regularization functional that balances the staircase artifacts against the “spots” effects is therefore a key issue in imaging science. Recently, Selesnick and Chen proposed total variation with overlapping group sparsity (OGS-TV) [24,25,26,27,28], which introduces the concept of the group gradient into the TV model and takes into full consideration the dissimilarity between smooth and edge regions. The OGS-TV model can distinguish an individual noise point from an image edge point, so it greatly alleviates the staircase effect. Building on this work, Liu et al. applied the method to the removal of speckle noise. Wu and Du applied the OGS model to Magnetic Resonance (MR) image reconstruction [27]. In this paper, we introduce the OGS model into the denoising of impulse noise.

In the typical denoising method, the L1-norm is commonly used as the fidelity term for impulse noise. However, the solution of the L1-norm problem usually involves the soft-thresholding function, which reduces large values by a constant amount. As a result, large signal values are systematically underestimated [26]. To overcome this shortcoming, many non-convex reconstruction methods have been proposed. Non-convex regularizers have also been shown to yield sparser solutions than the L1 regularizer [24,29,30]. Inspired by this research, we propose a total variation model based on overlapping group sparsity and Lp-pseudo-norm shrinkage (called OGS-Lp for short). Compared with the L1-norm, the Lp-pseudo-norm adds another degree of freedom to the model, which better characterizes the sparsity features of the image [31].

To solve the resulting problem, the alternating direction method of multipliers (ADMM) [32] and the majorization-minimization (MM) algorithm [33] are used to split the complex problem into several subproblems. Furthermore, an accelerated ADMM with restart [34] is used to solve the new model (OGS-Lp-FAST for short). In this way, a large amount of spatial-domain calculation is transferred to the frequency domain, which significantly reduces the complexity of the algorithm and speeds up its convergence.

The anisotropic total variation (ATV), isotropic total variation (ITV), total generalized variation (TGV), overlapping group sparsity with L1-norm (OGS-L1), overlapping group sparsity with pseudo-norm (OGS-Lp), and our method are compared experimentally using criteria such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and runtime. The model and algorithm proposed here could further improve the image denoising performance.

The contributions of this study are as follows: (1) By introducing OGS into TV, a new regularization term is proposed, which incorporates the advantages of the TV and OGS models. In the OGS-TV model, the neighborhood difference of each point is fully utilized to augment the difference between the smooth and edge regions, so it balances the staircase artifacts and “spots” effects well. (2) We adopt the Lp-pseudo-norm instead of the L1-norm to describe the fidelity term of impulse noise, extending the L1-norm-based OGS-TV to the OGS-Lp model. (3) The ADMM framework is employed to solve the proposed model. In the ADMM framework, the complex multi-constraint optimization problem is changed into several decoupled subproblems, which are easier to solve. Fourier transform technology is introduced to transform the matrix operations from the spatial domain to the frequency domain, which avoids large-scale matrix calculations. (4) To achieve faster convergence, the rapid ADMM with a restart process is adopted to improve the speed of the proposed algorithm. This improved model is named OGS-Lp-FAST. The rate of convergence increases from $O(\frac{1}{k})$ to $O(\frac{1}{{k}^{2}})$.

This paper is organized as follows: Section 2 gives a review of the traditional TV model; Section 3 describes the incorporation of overlapping group sparsity and Lp-pseudo-norm shrinkage into the TV model, and uses accelerated ADMM with a restart to solve the new model; Section 4 seeks to validate the proposed algorithm with standard images, and compares it with four other models; and Section 5 summarizes this paper and proposes future work.

## 2. Traditional TV Model

An image can contain many types of noise. According to the probability density function (PDF) of its amplitude, noise is classified as Gaussian noise, Rayleigh noise, uniform noise, exponential noise, impulse noise, gamma noise, etc.

In this paper, the discussion focuses on impulse noise denoising of images. Impulse noise is additive and is mainly caused by black-and-white bright and dark spots produced by image sensors, transmission channels, decoding processes, etc. In 2004, Nikolova [11] proposed the use of an L1 data-fidelity term for impulse-noise-related problems. Since then, many research papers have adopted this model in their characterization of this type of noise [35,36,37]. The ATV model of impulse noise based on this model is:

$$\mathit{F}=\underset{\mathit{F}}{\mathrm{arg}\,\mathrm{min}}{\Vert \mathit{F}-\mathit{G}\Vert}_{1}+\mu {R}_{ATV}\left(\mathit{F}\right),$$

where $\mathit{G}\in {\mathbb{R}}^{M\times M}$ is the image with noise, $\mathit{F}\in {\mathbb{R}}^{M\times M}$ is the denoised image, and ${\Vert \bullet \Vert}_{1}$ is the L1-norm. The first term in Equation (1), ${\Vert \mathit{F}-\mathit{G}\Vert}_{1}$, is called the fidelity term, and the second term $\mu {R}_{ATV}\left(\mathit{F}\right)$ is the sparsity regularization term, which encodes the prior sparsity information of the image. μ is a regularization parameter weighing the fidelity term against the regularization term. The image restoration problem is then solved by finding the $\mathit{F}$ that minimizes Equation (1). Since the regularization term of the anisotropic total variation model must minimize both horizontal and vertical gradients, ${R}_{ATV}\left(\mathit{F}\right)$ is defined as

$${R}_{ATV}\left(\mathit{F}\right)={\Vert {\mathit{K}}_{h}*\mathit{F}\Vert}_{1}+{\Vert {\mathit{K}}_{v}*\mathit{F}\Vert}_{1},$$

where * represents convolution, and ${\mathit{K}}_{h}=[-1,1]$ and ${\mathit{K}}_{v}=\left[\begin{array}{c}-1\\ 1\end{array}\right]$ are the differential operators used for the convolution operations in the horizontal and vertical directions, respectively.
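As a concrete illustration, the ATV regularizer of Equation (2) can be evaluated with plain first-order differences. The following Python sketch (the function name `r_atv` is ours, not from the paper) assumes a grayscale image stored as a NumPy array:

```python
import numpy as np

def r_atv(F):
    """Anisotropic TV regularizer of Equation (2): L1 norms of the
    horizontal and vertical first differences (K_h * F and K_v * F)."""
    dh = np.diff(F, axis=1)  # horizontal differences, K_h = [-1, 1]
    dv = np.diff(F, axis=0)  # vertical differences, K_v = [-1; 1]^T
    return np.abs(dh).sum() + np.abs(dv).sum()
```

For a piecewise-constant image the value is proportional to the total edge length, which is why minimizing it favors flat regions with sharp boundaries.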

## 3. Proposed Method

#### 3.1. Overlapping Group Sparsity with L1 Norm (OGS-L1) Model

To reduce the staircase effect of the ATV model, Selesnick and Chen proposed the overlapping group sparsity regularization term [5,6,7], which expands the vertical and horizontal gradients of pixels to the group gradient of $N$ adjacent points ($N$ is the size of the group). By setting a reasonable threshold, individual noise points and image edge points can be distinguished. This model preserves the edge information of the image and mitigates the staircase effect. With reference to the work of Selesnick and Chen, Liu et al. extended the overlapping group sparsity regularizer from the one-dimensional to the two-dimensional case, then substituted it into the anisotropic total variation model for deconvolution and the removal of salt-and-pepper noise [8].

This model is centered at the pixel ${x}_{i,j}$ and extends in all directions, forming multiple staggered, overlapping squares. The variable ${\stackrel{~}{\mathit{X}}}_{i,j,N,N}\in {\mathbb{R}}^{N\times N}$ is the $N\times N$ pixel matrix centered at the coordinates $(i,j)$, as shown in Equation (3):

$${\stackrel{~}{\mathit{X}}}_{i,j,N,N}=\left[\begin{array}{cccc}{x}_{i-{N}_{l},j-{N}_{l}}& {x}_{i-{N}_{l},j-{N}_{l}+1}& \cdots & {x}_{i-{N}_{l},j+{N}_{r}}\\ {x}_{i-{N}_{l}+1,j-{N}_{l}}& {x}_{i-{N}_{l}+1,j-{N}_{l}+1}& \cdots & {x}_{i-{N}_{l}+1,j+{N}_{r}}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{i+{N}_{r},j-{N}_{l}}& {x}_{i+{N}_{r},j-{N}_{l}+1}& \cdots & {x}_{i+{N}_{r},j+{N}_{r}}\end{array}\right],$$

where ${N}_{l}=\lfloor \frac{N-1}{2}\rfloor $, ${N}_{r}=\lfloor \frac{N}{2}\rfloor $, and ⌊ ⌋ is the rounding-down (floor) operator. We define $\phi (\mathit{X})$ as the overlapping group sparsity functional of the two-dimensional array:

$$\phi \left(\mathit{X}\right)={\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}{\Vert {\stackrel{~}{\mathit{X}}}_{i,j,N,N}\Vert}_{2}}}.$$
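For intuition, the functional of Equation (4) can be evaluated directly by sliding an $N\times N$ window over the array and summing the L2 norms of the groups. The sketch below uses our own naming, and zero padding outside the image boundary is an assumption:

```python
import numpy as np

def ogs_phi(X, N):
    """Overlapping group sparsity functional (Equation (4)): the sum, over
    every pixel, of the L2 norm of the N x N group centred at that pixel.
    Zero padding outside the image is assumed here."""
    Nl, Nr = (N - 1) // 2, N // 2  # floor((N-1)/2), floor(N/2), as in Equation (3)
    Xp = np.pad(X, ((Nl, Nr), (Nl, Nr)))
    total = 0.0
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            group = Xp[i:i + N, j:j + N]  # group centred at (i, j)
            total += np.sqrt(np.sum(group ** 2))
    return total
```

Note that for $N=1$ the groups degenerate to single pixels and $\phi$ reduces to the ordinary L1 norm, which recovers the ATV regularizer as a special case.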

The ATV model can then be extended to embody overlapping group sparsity regularization (called the OGS-L1 model for short), as shown in Equation (5):

$$\mathit{F}=\underset{\mathit{F}}{\mathrm{arg}\,\mathrm{min}}{\Vert \mathit{F}-\mathit{G}\Vert}_{1}+\mu [\phi \left({\mathit{K}}_{\mathit{h}}*\mathit{F}\right)+\phi \left({\mathit{K}}_{\mathit{v}}*\mathit{F}\right)],$$

where the regularization term $\phi \left(\cdot \right)$ of Equation (4) is the group gradient. Equation (4) shows that the OGS-TV model takes into full consideration the gradient information close to a pixel, so it strengthens the dissimilarity between the smooth and edge regions of the image.

#### 3.2. Overlapping Group Sparsity with Lp-Pseudo-Norm (OGS-Lp) Model

The L1-norm is commonly used as the fidelity term for impulse noise. However, the L1-norm is only the convex relaxation of the L0-norm. The ${p}^{th}$ power of the Lp-norm ($0<p\le 1$; for simplicity, we call it the Lp-pseudo-norm) is another relaxation of the L0-norm. In fact, the L1-norm constraint is the particular case of the Lp-pseudo-norm with $p=1$.

Recently, the Lp-pseudo-norm [38,39,40,41,42] has attracted much attention in academia. Woodworth and Chartrand pointed out that the Lp-quasinorm is better for approximating the original L0-norm than the L1-norm and developed an iterative Lp-quasinorm shrinkage (LpS) solving the problem [43].

The Lp-pseudo-norm makes an improvement on the sparsity-based shrinkage operator by introducing another degree of freedom, thus giving the model a better ability to depict the sparsity of an image in the gradient domain, as shown in Figure 1.

Lp-pseudo-norm contour lines are given in Figure 1, where p = 2 and p = 1 correspond to the L2- and L1-norms, respectively. Assuming that the image is contaminated by impulse noise with an absolute difference of τ, Figure 2 shows schematic plots of the anisotropic Lp total variation contour lines ${\mathrm{R}}_{\mathrm{ApTV}}\left(\mathit{F}\right)={\Vert {\mathit{K}}_{\mathit{h}}*\mathit{F}\Vert}_{p}^{p}+{\Vert {\mathit{K}}_{\mathit{v}}*\mathit{F}\Vert}_{p}^{p}\;(0<p\le 1)$ intersecting the fidelity term. The intersections of the contour lines with the fidelity term are sparser for $0<p<1$ (Figure 2b) than for $p=1$ (Figure 2a); therefore, the model is more robust against noise.

Based on the above analysis, the advantages of Lp-quasinorm regularization are listed as follows: (1) The LpS operator may converge to an accurate solution. (2) The Lp-quasinorm is more flexible than the L1-norm. This might be useful to adapt the degree of sparsity to the signal being processed. (3) The Lp-quasinorm feasible domain makes the solution robust to noise.

Thus, the L1-norm-based OGS-TV could be extended to the Lp-pseudo-norm (abbreviated as OGS-Lp) [38,39,44] and is expressed as follows:

$$\mathit{F}=\underset{\mathit{F}}{\mathrm{arg}\mathrm{min}}{\Vert \mathit{F}-\mathit{G}\Vert}_{p}^{p}+\mu [\phi \left({\mathit{K}}_{\mathit{h}}*\mathit{F}\right)+\phi \left({\mathit{K}}_{\mathit{v}}*\mathit{F}\right)].\text{}$$

## 4. Solution

#### 4.1. Solving the OGS-Lp Model

The OGS-Lp model is treated as a minimization problem whose computation is given below. In real-life images, the values of all pixels usually fall in a limited interval [**a**, **b**]. For calculation and verification convenience, the image data are normalized so that they all lie within [0, 1]. The projection operator ${{\rm P}}_{\Omega}$ is first defined with $\Omega =\{\mathit{F}\in {\mathbb{R}}^{M\times M}|0\le {\mathit{F}}_{i,j}\le 1\}$:

$${{\rm P}}_{\Omega}\left({\mathit{F}}_{i,j}\right)=\left\{\begin{array}{ll}0, & {\mathit{F}}_{i,j}<0,\\ {\mathit{F}}_{i,j}, & {\mathit{F}}_{i,j}\in \left[0,1\right],\\ 1, & {\mathit{F}}_{i,j}>1.\end{array}\right.$$
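In NumPy, this projection onto the box $[0,1]$ is a one-line clip; a minimal sketch (our naming):

```python
import numpy as np

def project_box(F):
    """Projection P_Omega of Equation (7): clamp every pixel to [0, 1]."""
    return np.clip(F, 0.0, 1.0)
```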

To solve the proposed model, we employ the ADMM framework and split the complex problem into several subproblems. First, some intermediate variables are introduced to decouple the subproblems: ${\mathit{Z}}_{\mathbf{1}}={\mathit{K}}_{\mathit{h}}\ast \mathit{F}$, ${\mathit{Z}}_{\mathbf{2}}={\mathit{K}}_{\mathit{v}}\ast \mathit{F}$, ${\mathit{Z}}_{\mathbf{3}}=\mathit{F}-\mathit{G}$, ${\mathit{Z}}_{\mathbf{4}}=\mathit{F}$. Equation (6) can then be reformulated as the following constrained optimization problem:

$$\begin{array}{l}\mathit{F}=\mathrm{arg}\underset{{\mathit{F}}_{i,j}\in \Omega}{\mathrm{min}}{\Vert {\mathit{Z}}_{\mathbf{3}}\Vert}_{p}^{p}+\mu [\phi \left({\mathit{Z}}_{\mathbf{1}}\right)+\phi \left({\mathit{Z}}_{\mathbf{2}}\right)]\\ \mathit{s}.\mathit{t}.\begin{array}{cc}& {\mathit{Z}}_{\mathbf{1}}={\mathit{K}}_{\mathit{h}}\ast \mathit{F},\;{\mathit{Z}}_{\mathbf{2}}={\mathit{K}}_{\mathit{v}}\ast \mathit{F},\;{\mathit{Z}}_{\mathbf{3}}=\mathit{F}-\mathit{G},\;{\mathit{Z}}_{\mathbf{4}}=\mathit{F}.\end{array}\end{array}$$

According to the principle of the ADMM framework, the Lagrange multipliers and a quadratic penalty term are needed to establish the augmented Lagrangian function:

$$\begin{array}{l}J\left(\mathit{F},{\mathit{Z}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{4}},{\beta}_{1},{\beta}_{2},{\beta}_{3},{\beta}_{4}\right)={\Vert {\mathit{Z}}_{\mathbf{3}}\Vert}_{p}^{p}+\mu \left[\phi \left({\mathit{Z}}_{\mathbf{1}}\right)+\phi \left({\mathit{Z}}_{\mathbf{2}}\right)\right]\\ -\langle {\mathbf{\Lambda}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*\mathit{F}\rangle +\frac{{\beta}_{1}}{2}{\Vert {\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*\mathit{F}\Vert}_{2}^{2}\\ -\langle {\mathbf{\Lambda}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*\mathit{F}\rangle +\frac{{\beta}_{2}}{2}{\Vert {\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*\mathit{F}\Vert}_{2}^{2}\\ -\langle {\mathbf{\Lambda}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{3}}-\mathit{F}+\mathit{G}\rangle +\frac{{\beta}_{3}}{2}{\Vert {\mathit{Z}}_{\mathbf{3}}-\mathit{F}+\mathit{G}\Vert}_{2}^{2}\\ -\langle {\mathbf{\Lambda}}_{\mathbf{4}},{\mathit{Z}}_{\mathbf{4}}-\mathit{F}\rangle +\frac{{\beta}_{4}}{2}{\Vert {\mathit{Z}}_{\mathbf{4}}-\mathit{F}\Vert}_{2}^{2},\end{array}$$

where ${\mathbf{\Lambda}}_{\mathbf{1}},{\mathbf{\Lambda}}_{\mathbf{2}},{\mathbf{\Lambda}}_{\mathbf{3}},{\mathbf{\Lambda}}_{\mathbf{4}}$ are the Lagrange multipliers, ${\beta}_{1},{\beta}_{2},{\beta}_{3},{\beta}_{4}>0$ are the penalty coefficients, and $\langle \mathit{A},\mathit{B}\rangle $ is the inner product of the matrices $\mathit{A}$ and $\mathit{B}$. We now introduce the scaled Lagrange multipliers ${\tilde{\mathit{Z}}}_{\mathbf{1}},{\tilde{\mathit{Z}}}_{\mathbf{2}},{\tilde{\mathit{Z}}}_{\mathbf{3}},{\tilde{\mathit{Z}}}_{\mathbf{4}}$, which are also called dual variables.

Define ${\tilde{\mathit{Z}}}_{\mathit{i}}\;\left(\mathit{i}=1,2,3,4\right)$ as follows: ${\tilde{\mathit{Z}}}_{\mathbf{1}}=\frac{1}{{\beta}_{1}}{\mathbf{\Lambda}}_{\mathbf{1}},{\tilde{\mathit{Z}}}_{\mathbf{2}}=\frac{1}{{\beta}_{2}}{\mathbf{\Lambda}}_{\mathbf{2}},{\tilde{\mathit{Z}}}_{\mathbf{3}}=\frac{1}{{\beta}_{3}}{\mathbf{\Lambda}}_{\mathbf{3}}$, ${\tilde{\mathit{Z}}}_{\mathbf{4}}=\frac{1}{{\beta}_{4}}{\mathbf{\Lambda}}_{\mathbf{4}}$. Adding the term $\frac{{\beta}_{i}}{2}{\left({\tilde{\mathit{Z}}}_{\mathit{i}}\right)}^{2}-\frac{{\beta}_{\mathit{i}}}{2}{\left({\tilde{\mathit{Z}}}_{\mathit{i}}\right)}^{2}=0\;\left(i=1,2,3,4\right)$ to Equation (9) completes the square; Equation (10) is obtained after rearrangement.

$$\begin{array}{l}J\left(\mathit{F},{\mathit{Z}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{4}},{\beta}_{1},{\beta}_{2},{\beta}_{3},{\beta}_{4}\right)\\ ={\Vert {\mathit{Z}}_{\mathbf{3}}\Vert}_{p}^{p}+\mu \left[\phi \left({\mathit{Z}}_{\mathbf{1}}\right)+\phi \left({\mathit{Z}}_{\mathbf{2}}\right)\right]\\ +\frac{{\beta}_{1}}{2}{\Vert {\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*\mathit{F}\Vert}_{2}^{2}-{\beta}_{1}\langle {\tilde{\mathit{Z}}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*\mathit{F}\rangle +\frac{{\beta}_{1}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{1}}\right)}^{2}-\frac{{\beta}_{1}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{1}}\right)}^{2}\\ +\frac{{\beta}_{2}}{2}{\Vert {\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*\mathit{F}\Vert}_{2}^{2}-{\beta}_{2}\langle {\tilde{\mathit{Z}}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*\mathit{F}\rangle +\frac{{\beta}_{2}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{2}}\right)}^{2}-\frac{{\beta}_{2}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{2}}\right)}^{2}\\ +\frac{{\beta}_{3}}{2}{\Vert {\mathit{Z}}_{\mathbf{3}}-\mathit{F}+\mathit{G}\Vert}_{2}^{2}-{\beta}_{3}\langle {\tilde{\mathit{Z}}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{3}}-\mathit{F}+\mathit{G}\rangle +\frac{{\beta}_{3}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{3}}\right)}^{2}-\frac{{\beta}_{3}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{3}}\right)}^{2}\\ +\frac{{\beta}_{4}}{2}{\Vert {\mathit{Z}}_{\mathbf{4}}-\mathit{F}\Vert}_{2}^{2}-{\beta}_{4}\langle {\tilde{\mathit{Z}}}_{\mathbf{4}},{\mathit{Z}}_{\mathbf{4}}-\mathit{F}\rangle +\frac{{\beta}_{4}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{4}}\right)}^{2}-\frac{{\beta}_{4}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{4}}\right)}^{2}{}_{}^{}.\end{array}$$

In Equation (10), the expressions

$$\frac{{\beta}_{i}}{2}{\Vert {\mathit{Z}}_{\mathit{i}}-\mathit{A}\Vert}_{2}^{2}-{\beta}_{i}\langle {\tilde{\mathit{Z}}}_{\mathit{i}},{\mathit{Z}}_{\mathit{i}}-\mathit{A}\rangle +\frac{{\beta}_{i}}{2}{\left({\tilde{\mathit{Z}}}_{\mathit{i}}\right)}^{2}\begin{array}{cc}& \left(\mathit{i}=1,2,3,4\right)\end{array}$$

satisfy the form ${\mathit{a}}^{2}-2\mathit{a}\mathit{b}+{\mathit{b}}^{2}={\left(\mathit{a}-\mathit{b}\right)}^{2}$ (up to the common factor $\frac{{\beta}_{i}}{2}$), so Equation (10) can be written as Equation (12):

$$\begin{array}{l}J\left(\mathit{F},{\mathit{Z}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{4}},{\beta}_{1},{\beta}_{2},{\beta}_{3},{\beta}_{4}\right)\\ ={\Vert {\mathit{Z}}_{\mathbf{3}}\Vert}_{p}^{p}+\mu \phi \left({\mathit{Z}}_{\mathbf{1}}\right)+\mu \phi \left({\mathit{Z}}_{\mathbf{2}}\right)+\frac{{\beta}_{1}}{2}{\Vert {\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*\mathit{F}-{\tilde{\mathit{Z}}}_{1}\Vert}_{2}^{2}\\ +\frac{{\beta}_{2}}{2}{\Vert {\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*\mathit{F}-{\tilde{\mathit{Z}}}_{\mathbf{2}}\Vert}_{2}^{2}+\frac{{\beta}_{3}}{2}{\Vert {\mathit{Z}}_{\mathbf{3}}-\mathit{F}+\mathit{G}-{\tilde{\mathit{Z}}}_{\mathbf{3}}\Vert}_{2}^{2}\\ +\frac{{\beta}_{4}}{2}{\Vert {\mathit{Z}}_{\mathbf{4}}-\mathit{F}-{\tilde{\mathit{Z}}}_{\mathbf{4}}\Vert}_{2}^{2}-\frac{{\beta}_{1}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{1}}\right)}^{2}-\frac{{\beta}_{2}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{2}}\right)}^{2}-\frac{{\beta}_{3}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{3}}\right)}^{2}-\frac{{\beta}_{4}}{2}{\left({\tilde{\mathit{Z}}}_{\mathbf{4}}\right)}^{2}.\end{array}$$

Because ${\mathit{Z}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{2}},{\mathit{Z}}_{\mathbf{3}},{\mathit{Z}}_{\mathbf{4}},{\tilde{\mathit{Z}}}_{\mathbf{1}},{\tilde{\mathit{Z}}}_{\mathbf{2}},{\tilde{\mathit{Z}}}_{\mathbf{3}},{\tilde{\mathit{Z}}}_{\mathbf{4}}$ are mutually independent, they can be solved as independent subproblems according to the principle of the ADMM algorithm. Each subproblem is solved by an iterative algorithm minimizing over ${\mathit{Z}}_{\mathit{i}}$.

Let ${\mathit{Z}}_{\mathit{i}}^{\left(k\right)}\;\left(\mathit{i}=1,2,3,4\right)$ denote the value of ${\mathit{Z}}_{\mathit{i}}$ after the $k$-th iteration. For a given ${\mathit{Z}}_{\mathit{i}}^{\left(k\right)}$, the next iterate ${\mathit{Z}}_{\mathit{i}}^{\left(k+1\right)}$ is generated as follows:

$${\mathit{Z}}_{\mathbf{1}}^{\left(k+1\right)}=\underset{{\mathit{Z}}_{\mathbf{1}}}{\mathrm{arg}\mathrm{min}}\mu \phi \left({\mathit{Z}}_{\mathbf{1}}\right)+\frac{{\beta}_{1}}{2}{\Vert {\mathit{Z}}_{\mathbf{1}}-{\mathit{K}}_{\mathit{h}}*{\mathit{F}}^{\left(k\right)}-{\tilde{\mathit{Z}}}_{\mathbf{1}}^{\left(k\right)}\Vert}_{2}^{2}.\text{}$$

$${\mathit{Z}}_{\mathbf{2}}^{\left(k+1\right)}=\underset{{\mathit{Z}}_{\mathbf{2}}}{\mathrm{arg}\mathrm{min}}\mu \phi \left({\mathit{Z}}_{\mathbf{2}}\right)+\frac{{\beta}_{2}}{2}{\Vert {\mathit{Z}}_{\mathbf{2}}-{\mathit{K}}_{\mathit{v}}*{\mathit{F}}^{\left(k\right)}-{\tilde{\mathit{Z}}}_{\mathbf{2}}^{\left(k\right)}\Vert}_{2}^{2}.\text{}$$

$${\mathit{Z}}_{\mathbf{3}}^{\left(k+1\right)}=\underset{{\mathit{Z}}_{\mathbf{3}}}{\mathrm{arg}\mathrm{min}}{\Vert {\mathit{Z}}_{\mathbf{3}}\Vert}_{p}^{p}+\frac{{\beta}_{3}}{2}{\Vert {\mathit{Z}}_{\mathbf{3}}-{\mathit{F}}^{\left(k\right)}+\mathit{G}-{\tilde{\mathit{Z}}}_{\mathbf{3}}^{\left(k\right)}\Vert}_{2}^{2}.\text{}$$

$${\mathit{Z}}_{\mathbf{4}}^{\left(k+1\right)}=\underset{{\mathit{Z}}_{\mathbf{4}}}{\mathrm{arg}\mathrm{min}}\frac{{\beta}_{4}}{2}{\Vert {\mathit{Z}}_{\mathbf{4}}-{\mathit{F}}^{\left(k\right)}-{\tilde{\mathit{Z}}}_{\mathbf{4}}^{\left(k\right)}\Vert}_{2}^{2}.\text{}$$

Each of these four equations is solved below:

(1) The ${\mathit{Z}}_{\mathbf{1}}^{\left(k+1\right)}$ and ${\mathit{Z}}_{\mathbf{2}}^{\left(k+1\right)}$ subproblems are solved by the majorization-minimization (MM) algorithm [33], which approximates the solution of the target problem by constructing a well-behaved auxiliary (majorizing) function and minimizing it iteratively.

First, consider a minimization problem of the following form:

$$\underset{v}{\mathrm{min}}\,P\left(v\right)=\frac{\alpha}{2}{\Vert v-{v}_{0}\Vert}_{2}^{2}+\phi \left(v\right),\;v\in {\mathbb{R}}^{{M}^{2}\times 1},$$

where α > 0 and φ(v) satisfies $\phi \left(v\right)={\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}{\Vert {\tilde{v}}_{i,j,N,N}\Vert}_{2}}}$. To avoid solving the minimization problem P(v) directly, a function Q(v, u) with Q(v, u) ≥ P(v) for all $v,u$ is constructed, where equality holds if and only if u = v. Thus, the optimal solution of P(v) is reached by minimizing Q(v, u). Generally, an MM iterative algorithm for minimizing P(v) has the form:

$${v}^{(n+1)}=\underset{v}{\mathrm{arg}\,\mathrm{min}}\,Q(v,{v}^{\left(n\right)}).$$
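Schematically, the MM scheme of Equation (18) is just repeated minimization of the surrogate. A minimal sketch, where `q_argmin` is a hypothetical caller-supplied map returning $\mathrm{arg\,min}_{v}\,Q(v,v^{(n)})$ for the current iterate:

```python
def mm_minimize(q_argmin, v_init, n_iter=50):
    """Generic MM iteration (Equation (18)): v^(n+1) = argmin_v Q(v, v^(n)).
    Each step minimizes a majorizer of P, so P(v^(n)) is non-increasing."""
    v = v_init
    for _ in range(n_iter):
        v = q_argmin(v)
    return v
```

The monotone-descent property follows because $P(v^{(n+1)})\le Q(v^{(n+1)},v^{(n)})\le Q(v^{(n)},v^{(n)})=P(v^{(n)})$.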

It can be solved step-by-step in the following way.

The properties of the function $\phi \left(v\right)={\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}{\Vert {\tilde{v}}_{i,j,N,N}\Vert}_{2}}}$ are first observed. For any $u\ne 0$,

$$\frac{1}{2}\left(\frac{1}{{\Vert u\Vert}_{2}}{\Vert v\Vert}_{2}^{2}+{\Vert u\Vert}_{2}\right)\ge {\Vert v\Vert}_{2},$$

where the equality sign holds when u = v. This means $\phi \left(v\right)$ can be majorized by the function $S\left(v,u\right)$ of Equation (20):

$$S\left(v,u\right)=\frac{1}{2}{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}\left[\frac{1}{{\Vert {\tilde{u}}_{i,j,N,N}\Vert}_{2}}{\Vert {\tilde{v}}_{i,j,N,N}\Vert}_{2}^{2}+{\Vert {\tilde{u}}_{i,j,N,N}\Vert}_{2}\right]}}\ge \phi \left(v\right)={\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}{\Vert {\tilde{v}}_{i,j,N,N}\Vert}_{2}}}.$$

After a simple calculation [31], $S\left(v,u\right)$ is rewritten as

$$S\left(v,u\right)=\frac{1}{2}{\Vert \mathit{D}\left(u\right)v\Vert}_{2}^{2}+C\left(u\right)$$

to facilitate future calculations.

$C\left(u\right)$ in the above equation is independent of $v$, and $\mathit{D}\left(u\right)\in {\mathbb{R}}^{{M}^{2}\times {M}^{2}}$ is a diagonal matrix with its elements defined as

$${\left[\mathit{D}\left(u\right)\right]}_{m,m}=\sqrt{{\displaystyle \sum _{i=-{N}_{l}}^{{N}_{r}}{\displaystyle \sum _{j=-{N}_{l}}^{{N}_{r}}{\left[{\displaystyle \sum _{{k}_{1}=-{N}_{l}}^{{N}_{r}}{\displaystyle \sum _{{k}_{2}=-{N}_{l}}^{{N}_{r}}{\left|{u}_{m-i+{k}_{1},m-j+{k}_{2}}\right|}^{2}}}\right]}^{-\frac{1}{2}}}}}\begin{array}{cc}& \end{array}\left(m=1,2,\cdots ,{M}^{2}\right).$$

The entries of $\mathit{D}$ can be easily computed using the MATLAB built-in function “conv2”. Putting Equations (17), (21), and (22) together, the surrogate function $Q\left(v,u\right)$ for the optimization problem $P\left(v\right)$ can be written as

$$Q\left(v,u\right)=\frac{\alpha}{2}{\Vert v-{v}_{0}\Vert}_{2}^{2}+S\left(v,u\right)=\frac{\alpha}{2}{\Vert v-{v}_{0}\Vert}_{2}^{2}+\frac{1}{2}{\Vert \mathit{D}\left(u\right)v\Vert}_{2}^{2}+C\left(u\right).\text{}$$

When $v=u$, $Q\left(u,u\right)=P\left(u\right)$. To minimize $P\left(v\right)$, the MM algorithm iteratively solves

$${v}^{\left(n+1\right)}=\underset{v}{\mathrm{arg}\,\mathrm{min}}\frac{\alpha}{2}{\Vert v-{v}_{0}\Vert}_{2}^{2}+\frac{1}{2}{\Vert \mathit{D}\left({v}^{\left(n\right)}\right)v\Vert}_{2}^{2},\;n=1,2,3,\dots$$

with the solution

$${v}^{\left(n+1\right)}={\left(\mathit{I}+\frac{1}{\alpha}{\mathit{D}}^{2}\left({v}^{\left(n\right)}\right)\right)}^{-1}{v}_{0},\;n=1,2,3,\dots,$$

where $\mathit{I}\in {\mathbb{R}}^{{M}^{2}\times {M}^{2}}$ is the identity matrix of the same size as ${\mathit{D}}^{2}\left({v}^{\left(n\right)}\right)$, and ${\mathit{D}}^{2}\left({v}^{\left(n\right)}\right)$ is also a diagonal matrix of the form of Equation (22).
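Because $\mathit{D}\left({v}^{\left(n\right)}\right)$ is diagonal, the inverse in Equation (25) reduces to an element-wise division. The sketch below assembles the diagonal of $\mathit{D}^{2}$ with two box-filter sums in the spirit of Equation (22) (the paper computes these entries with MATLAB's conv2; here a small NumPy helper plays that role). Zero padding and the small `eps` guarding empty groups are our assumptions:

```python
import numpy as np

def box_filter(A, N):
    """Sum of A over the N x N window around each pixel (zero padding),
    i.e. a convolution with a box kernel, as MATLAB's conv2 would compute."""
    Nl, Nr = (N - 1) // 2, N // 2
    Ap = np.pad(A, ((Nl, Nr), (Nl, Nr)))
    out = np.zeros(A.shape, dtype=float)
    for di in range(N):
        for dj in range(N):
            out += Ap[di:di + A.shape[0], dj:dj + A.shape[1]]
    return out

def mm_step(v, v0, alpha, N, eps=1e-12):
    """One MM update of Equation (25) for a 2-D array v: since D(v) is
    diagonal, (I + D^2/alpha)^(-1) v0 is an element-wise division.
    d2 sums, over the groups around each pixel, the reciprocal of the
    group's L2 norm (cf. Equation (22))."""
    group_energy = box_filter(v ** 2, N)                  # squared group norms
    d2 = box_filter(1.0 / np.sqrt(group_energy + eps), N)
    return v0 / (1.0 + d2 / alpha)
```

Since every entry of `d2` is positive, each MM step strictly shrinks `v0` toward zero, which matches the sparsity-promoting role of the group regularizer.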

Observing Equations (13) and (14), the subproblems ${\mathit{Z}}_{\mathbf{1}},{\mathit{Z}}_{\mathbf{2}}$ conform to the form of Equation (17) and can be solved iteratively using Equation (26), where ${\mathit{Z}}_{\mathit{i}\left(n+1\right)}^{\left(k+1\right)}\left(i=1,2\right)$ represents the $\left(n+1\right)$-th MM iterate in the $\left(k+1\right)$-th outer loop:

$${\mathit{Z}}_{\mathit{i}\left(n+1\right)}^{\left(k+1\right)}=\mathit{m}\mathit{a}\mathit{t}\left\{{\left[\mathit{I}+\frac{\mu}{{\beta}_{i}}{\mathit{D}}^{2}\left({\mathit{Z}}_{\mathit{i}\left(n\right)}^{\left(k+1\right)}\right)\right]}^{-1}{\mathfrak{z}}_{\mathit{i}\left(0\right)}^{\left(k+1\right)}\right\},\left(i=1,2\right),$$

where $\mathit{m}\mathit{a}\mathit{t}$ reshapes a vector into a matrix and ${\mathfrak{z}}_{\mathit{i}\left(0\right)}^{\left(k+1\right)}$ is the vector form of ${\mathit{Z}}_{\mathit{i}\left(0\right)}^{\left(k+1\right)}$.

The initial values ${\mathit{Z}}_{\mathbf{1}\left(0\right)}^{\left(k+1\right)}, {\mathit{Z}}_{\mathbf{2}\left(0\right)}^{\left(k+1\right)}$ in the above equation are:

$$\left\{\begin{array}{l}{\mathit{Z}}_{\mathbf{1}\left(0\right)}^{\left(k+1\right)}={\mathit{K}}_{\mathit{h}}*{\mathit{F}}^{\left(k\right)}+{\tilde{\mathit{Z}}}_{\mathbf{1}}^{\left(k\right)}\\ {\mathit{Z}}_{\mathbf{2}\left(0\right)}^{\left(k+1\right)}={\mathit{K}}_{\mathit{v}}*{\mathit{F}}^{\left(k\right)}+{\tilde{\mathit{Z}}}_{\mathbf{2}}^{\left(k\right)}\end{array}\right.$$

(2) $\mathbf{Z}_{3}^{(k+1)}$ can be solved by the well-known soft-thresholding shrinkage method [45], generalized to the Lp-pseudo-norm, as shown below:

$$\begin{array}{rl}\mathbf{Z}_{3}^{(k+1)}&=\underset{\mathbf{Z}_{3}}{\mathrm{arg}\mathrm{min}}\,{\Vert \mathbf{Z}_{3}\Vert}_{p}^{p}+\frac{{\beta}_{3}}{2}{\Vert \mathbf{Z}_{3}-\mathbf{F}^{(k)}+\mathbf{G}-{\tilde{\mathbf{Z}}}_{3}^{(k)}\Vert}_{2}^{2}\\ &=\mathrm{shrinkage}\left(\mathbf{F}^{(k)}-\mathbf{G}+{\tilde{\mathbf{Z}}}_{3}^{(k)},\frac{1}{{\beta}_{3}},p\right)\\ &=\mathrm{sign}\left(\mathbf{F}^{(k)}-\mathbf{G}+{\tilde{\mathbf{Z}}}_{3}^{(k)}\right)\odot \mathrm{max}\left(\left|\mathbf{F}^{(k)}-\mathbf{G}+{\tilde{\mathbf{Z}}}_{3}^{(k)}\right|-{\left(\frac{1}{{\beta}_{3}}\right)}^{2-p}{\left|\mathbf{F}^{(k)}-\mathbf{G}+{\tilde{\mathbf{Z}}}_{3}^{(k)}\right|}^{p-1},0\right),\end{array}$$

where $\odot$, $|\cdot|$, and $\mathrm{max}(\cdot,0)$ act element-wise. With the exponent $2-p$, the rule reduces to standard soft thresholding with threshold $1/{\beta}_{3}$ when $p=1$.
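The generalized p-shrinkage operator can be sketched in NumPy as follows, using the exponent convention that recovers soft thresholding at p = 1; the function name is illustrative:

```python
import numpy as np

def lp_shrinkage(x, tau, p):
    """Element-wise generalized (Lp) shrinkage.

    For p = 1 this reduces to classical soft thresholding with
    threshold tau; for 0 < p < 1 large entries are shrunk less.
    """
    x = np.asarray(x, dtype=float)
    absx = np.abs(x)
    out = np.zeros_like(x)
    nz = absx > 0  # avoid 0 ** (p - 1) for p < 1
    out[nz] = np.sign(x[nz]) * np.maximum(
        absx[nz] - tau ** (2.0 - p) * absx[nz] ** (p - 1.0), 0.0)
    return out
```

Entries whose magnitude falls below the data-dependent threshold are set to zero, which is what promotes sparsity in the difference domain.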

(3) The ${\mathit{Z}}_{\mathbf{4}}^{\left(k+1\right)}$ subproblem is computed as

$$\mathbf{Z}_{4}^{(k+1)}=\underset{\mathbf{Z}_{4}}{\mathrm{arg}\mathrm{min}}\,\frac{{\beta}_{4}}{2}{\Vert \mathbf{Z}_{4}-\mathbf{F}^{(k)}-{\tilde{\mathbf{Z}}}_{4}^{(k)}\Vert}_{2}^{2}=\mathrm{min}\left(1,\mathrm{max}\left(\mathbf{F}^{(k)}+{\tilde{\mathbf{Z}}}_{4}^{(k)},0\right)\right),$$
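This step is simply the projection of $\mathbf{F}^{(k)}+{\tilde{\mathbf{Z}}}_{4}^{(k)}$ onto the box $[0,1]$; in NumPy it is a one-line clip (sketch):

```python
import numpy as np

def project_box(x):
    """Element-wise projection onto [0, 1]: min(1, max(x, 0))."""
    return np.clip(x, 0.0, 1.0)
```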

(4) The $\mathbf{F}^{(k+1)}$ subproblem is solved by substituting $\mathbf{Z}_{1}^{(k+1)},\mathbf{Z}_{2}^{(k+1)},\mathbf{Z}_{3}^{(k+1)},\mathbf{Z}_{4}^{(k+1)}$ into Equation (12) and solving for the variable $\mathbf{F}^{(k+1)}$. Under the assumption of periodic boundary conditions, the fast Fourier transform is applied to both sides of the equation to perform the computation in the frequency domain instead of the spatial domain, which reduces the computational complexity caused by matrix multiplication. Matrix multiplication is converted to an element-wise (dot) product; in other words, we solve the following normal equation:

$$\begin{array}{l}\left({\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*{\overline{\mathbf{K}}}_{h}+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*{\overline{\mathbf{K}}}_{v}+{\beta}_{3}\mathbf{1}+{\beta}_{4}\mathbf{1}\right).*{\overline{\mathbf{F}}}^{(k+1)}\\ ={\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*\left({\overline{\mathbf{Z}}}_{1}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{1}^{(k)}\right)+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*\left({\overline{\mathbf{Z}}}_{2}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{2}^{(k)}\right)\\ +{\beta}_{3}\left({\overline{\mathbf{Z}}}_{3}^{(k+1)}+\overline{\mathbf{G}}-{\overline{\tilde{\mathbf{Z}}}}_{3}^{(k)}\right)+{\beta}_{4}\left({\overline{\mathbf{Z}}}_{4}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{4}^{(k)}\right),\end{array}$$

where $\overline{\mathbf{x}}$ is the frequency-domain representation of $\mathbf{x}$, "$.*$" stands for the element-wise (dot) product, "$*$" denotes the complex conjugate, $\mathbf{1}$ is the matrix whose entries are all 1, and $\mathcal{F}$ represents the two-dimensional fast Fourier transform (FFT). Rearranging the equation gives $\mathbf{F}^{(k+1)}$ as

$$\mathbf{F}^{(k+1)}={\mathcal{F}}^{-1}\left(\frac{{\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*\left({\overline{\mathbf{Z}}}_{1}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{1}^{(k)}\right)+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*\left({\overline{\mathbf{Z}}}_{2}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{2}^{(k)}\right)+{\beta}_{3}\left({\overline{\mathbf{Z}}}_{3}^{(k+1)}+\overline{\mathbf{G}}-{\overline{\tilde{\mathbf{Z}}}}_{3}^{(k)}\right)+{\beta}_{4}\left({\overline{\mathbf{Z}}}_{4}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{4}^{(k)}\right)}{{\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*{\overline{\mathbf{K}}}_{h}+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*{\overline{\mathbf{K}}}_{v}+{\beta}_{3}\mathbf{1}+{\beta}_{4}\mathbf{1}}\right).$$
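As a concrete illustration of this frequency-domain solve, here is a minimal NumPy sketch. It assumes periodic boundaries, first-order difference kernels anchored at the origin for $\mathbf{K}_{h}$ and $\mathbf{K}_{v}$, and illustrative variable names (`Z1`–`Z4` for the auxiliary variables, `T1`–`T4` for the duals):

```python
import numpy as np

def psf2otf(kernel, shape):
    """Zero-pad a small kernel to `shape` and return its 2-D FFT (the OTF).
    For difference kernels anchored at (0, 0) this matches periodic-boundary
    convolution; a circular shift to re-center the kernel is omitted here."""
    pad = np.zeros(shape)
    pad[:kernel.shape[0], :kernel.shape[1]] = kernel
    return np.fft.fft2(pad)

def f_update(Z1, Z2, Z3, Z4, T1, T2, T3, T4, G, betas, shape):
    """One F-step: solve the normal equation by element-wise division
    in the frequency domain, then transform back."""
    b1, b2, b3, b4 = betas
    Kh = psf2otf(np.array([[1.0, -1.0]]), shape)    # horizontal difference
    Kv = psf2otf(np.array([[1.0], [-1.0]]), shape)  # vertical difference
    num = (b1 * np.conj(Kh) * np.fft.fft2(Z1 - T1)
           + b2 * np.conj(Kv) * np.fft.fft2(Z2 - T2)
           + b3 * np.fft.fft2(Z3 + G - T3)
           + b4 * np.fft.fft2(Z4 - T4))
    den = b1 * np.abs(Kh) ** 2 + b2 * np.abs(Kv) ** 2 + b3 + b4
    return np.real(np.fft.ifft2(num / den))
```

Note that the denominator is strictly positive as long as ${\beta}_{3}+{\beta}_{4}>0$, so the element-wise division is always well defined.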

(5) The dual variables ${\tilde{\mathbf{Z}}}_{1},{\tilde{\mathbf{Z}}}_{2},{\tilde{\mathbf{Z}}}_{3},{\tilde{\mathbf{Z}}}_{4}$ are updated via the gradient ascent method:

$$\{\begin{array}{l}{\tilde{\mathit{Z}}}_{\mathbf{1}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{1}}^{\left(k\right)}+\gamma {\beta}_{1}({\mathit{K}}_{\mathit{h}}*{\mathit{F}}^{\left(k+1\right)}-{\mathit{Z}}_{\mathbf{1}}^{\left(k+1\right)})\\ {\tilde{\mathit{Z}}}_{\mathbf{2}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{2}}^{\left(k\right)}+\gamma {\beta}_{2}({\mathit{K}}_{\mathit{v}}*{\mathit{F}}^{\left(k+1\right)}-{\mathit{Z}}_{\mathbf{2}}^{\left(k+1\right)})\\ {\tilde{\mathit{Z}}}_{\mathbf{3}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{3}}^{\left(k\right)}+\gamma {\beta}_{3}({\mathit{F}}^{\left(k+1\right)}-\mathit{G}-{\mathit{Z}}_{\mathbf{3}}{}^{\left(k+1\right)})\\ {\tilde{\mathit{Z}}}_{\mathbf{4}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{4}}^{\left(k\right)}+\gamma {\beta}_{4}({\mathit{F}}^{\left(k+1\right)}-{\mathit{Z}}_{\mathbf{4}}^{\left(k+1\right)})\end{array}.\text{}$$
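Each dual update above is a single gradient-ascent step on the scaled primal residual; a one-line sketch with illustrative names:

```python
import numpy as np

def dual_ascent(z_tilde, primal_residual, gamma, beta):
    """Dual update: z_tilde <- z_tilde + gamma * beta * residual,
    where the residual is e.g. Kh * F - Z1 for the first dual variable."""
    return z_tilde + gamma * beta * primal_residual
```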

#### 4.2. OGS-Lp-FAST Model

Denoising models based on OGS are more time-consuming than TV-based models, mainly because the OGS model considers the gradient information of the neighborhood in the reconstructed image, which makes the computation more complex. This is a shortcoming to be addressed.

Goldstein et al. [12] proposed an accelerated ADMM algorithm with a restart, which improves the convergence rate of ADMM from $O(\frac{1}{k})$ to $O(\frac{1}{{k}^{2}})$. Inspired by this work, we adopt the algorithm to improve the OGS-Lp model. The modified model is named OGS-Lp-FAST ("Ours" for short). The auxiliary variables $\mathbf{U}_{i}\ (i=1,2)$ and ${\tilde{\mathbf{U}}}_{i}\ (i=1,2)$ are first introduced. Under this framework, $\mathbf{Z}_{i}\ (i=1,2,3,4)$ is updated in the following way:

$$\{\begin{array}{l}\mathbf{Z}_{1(n+1)}^{(k+1)}=\mathrm{mat}\left\{{\left[\mathbf{I}+\frac{\mu}{{\beta}_{1}}\mathbf{D}^{2}\left(\mathbf{Z}_{1(n)}^{(k+1)}\right)\right]}^{-1}{\mathfrak{z}}_{1(0)}^{(k+1)}\right\}\\ \mathbf{Z}_{2(n+1)}^{(k+1)}=\mathrm{mat}\left\{{\left[\mathbf{I}+\frac{\mu}{{\beta}_{2}}\mathbf{D}^{2}\left(\mathbf{Z}_{2(n)}^{(k+1)}\right)\right]}^{-1}{\mathfrak{z}}_{2(0)}^{(k+1)}\right\}\\ \mathbf{Z}_{3}^{(k+1)}=\mathrm{shrinkage}\left(\mathbf{F}^{(k)}-\mathbf{G}+{\tilde{\mathbf{Z}}}_{3}^{(k)},\frac{1}{{\beta}_{3}},p\right)\\ \mathbf{Z}_{4}^{(k+1)}=\mathrm{min}\left(1,\mathrm{max}\left(\mathbf{F}^{(k)}+{\tilde{\mathbf{Z}}}_{4}^{(k)},0\right)\right),\end{array}\text{}$$

The initial values of $\mathbf{Z}_{1(0)}^{(k+1)}$ and $\mathbf{Z}_{2(0)}^{(k+1)}$ in the above equation are:

$$\{\begin{array}{l}{\mathit{Z}}_{\mathbf{1}\left(0\right)}^{\left(k+1\right)}={\mathit{K}}_{\mathit{h}}*{\mathit{F}}^{\left(k\right)}+{\tilde{\mathit{U}}}_{1}^{\left(k\right)}\\ {\mathit{Z}}_{\mathbf{2}\left(0\right)}^{\left(k+1\right)}={\mathit{K}}_{\mathit{v}}*{\mathit{F}}^{\left(k\right)}+{\tilde{\mathit{U}}}_{2}^{\left(k\right)}\end{array}.\text{}$$

The dual variables ${\tilde{\mathbf{Z}}}_{i}\ (i=1,2,3,4)$ can be updated as follows:

$$\{\begin{array}{l}{\tilde{\mathit{Z}}}_{\mathbf{1}}^{\left(k+1\right)}={\tilde{\mathit{U}}}_{\mathbf{1}}^{\left(k\right)}+\gamma {\beta}_{1}\left({\mathit{K}}_{\mathit{h}}*{\mathit{F}}^{\left(k\right)}-{\mathit{Z}}_{\mathbf{1}}^{\left(k+1\right)}\right)\\ {\tilde{\mathit{Z}}}_{\mathbf{2}}^{\left(k+1\right)}={\tilde{\mathit{U}}}_{\mathbf{2}}^{\left(k\right)}+\gamma {\beta}_{2}\left({\mathit{K}}_{\mathit{v}}*{\mathit{F}}^{\left(k\right)}-{\mathit{Z}}_{\mathbf{2}}^{\left(k+1\right)}\right)\\ {\tilde{\mathit{Z}}}_{\mathbf{3}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{3}}^{\left(k\right)}+\gamma {\beta}_{3}\left({\mathit{F}}^{\left(k\right)}-\mathit{G}-{\mathit{Z}}_{\mathbf{3}}{}^{\left(k+1\right)}\right)\\ {\tilde{\mathit{Z}}}_{\mathbf{4}}^{\left(k+1\right)}={\tilde{\mathit{Z}}}_{\mathbf{4}}^{\left(k\right)}+\gamma {\beta}_{4}\left({\mathit{F}}^{\left(k\right)}-{\mathit{Z}}_{\mathbf{4}}^{\left(k+1\right)}\right)\end{array}.\text{}$$

As image denoising is not a strongly convex problem, the iteration needs to be restarted to ensure the convergence of the algorithm. The algorithm is restarted whenever Equation (33) is not satisfied:

$${c}_{i}^{(k)}<\eta {c}_{i}^{(k-1)}\quad (i=1,2),\text{}$$

where ${c}_{i}^{(k)}={\beta}^{-1}{\Vert {\tilde{\mathbf{Z}}}_{i}^{(k)}-{\tilde{\mathbf{U}}}_{i}^{(k)}\Vert}_{2}^{2}+\beta {\Vert \mathbf{Z}_{i}^{(k)}-\mathbf{U}_{i}^{(k)}\Vert}_{2}^{2}$ is the sum of the $k$-th primal residual $\beta {\Vert \mathbf{Z}_{i}^{(k)}-\mathbf{U}_{i}^{(k)}\Vert}_{2}^{2}$ and dual residual ${\beta}^{-1}{\Vert {\tilde{\mathbf{Z}}}_{i}^{(k)}-{\tilde{\mathbf{U}}}_{i}^{(k)}\Vert}_{2}^{2}$, and $\eta$ is a number close to 1. To prevent frequent restarts, we set $\eta =0.97$.

When ${c}_{i}^{(k)}<\eta {c}_{i}^{(k-1)}$ holds, the acceleration step size ${\epsilon}_{i}$ and the auxiliary variables $\mathbf{U}_{i}\ (i=1,2)$ and ${\tilde{\mathbf{U}}}_{i}^{(k+1)}\ (i=1,2)$ are updated as follows:

$$\{\begin{array}{l}{\epsilon}_{i}^{(k+1)}=\frac{1+\sqrt{1+4{({\epsilon}_{i}^{(k)})}^{2}}}{2}\ (i=1,2)\\ \mathbf{U}_{i}^{(k+1)}=\mathbf{Z}_{i}^{(k+1)}+\frac{{\epsilon}_{i}^{(k)}-1}{{\epsilon}_{i}^{(k+1)}}(\mathbf{Z}_{i}^{(k+1)}-\mathbf{Z}_{i}^{(k)})\ (i=1,2)\\ {\tilde{\mathbf{U}}}_{i}^{(k+1)}={\tilde{\mathbf{Z}}}_{i}^{(k+1)}+\frac{{\epsilon}_{i}^{(k)}-1}{{\epsilon}_{i}^{(k+1)}}({\tilde{\mathbf{Z}}}_{i}^{(k+1)}-{\tilde{\mathbf{Z}}}_{i}^{(k)})\ (i=1,2)\end{array}.\text{}$$

Upon restart, the variables are instead reset according to:

$$\{\begin{array}{l}{\epsilon}_{i}^{(k+1)}=1\ (i=1,2)\\ \mathbf{U}_{i}^{(k+1)}=\mathbf{Z}_{i}^{(k+1)}\ (i=1,2)\\ {\tilde{\mathbf{U}}}_{i}^{(k+1)}={\tilde{\mathbf{Z}}}_{i}^{(k+1)}\ (i=1,2)\\ {c}_{i}^{(k+1)}={\eta}^{-1}{c}_{i}^{(k)}\ (i=1,2)\end{array}.\text{}$$
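The acceleration and restart logic can be sketched as two small helpers, assuming a scalar step size $\epsilon$ and array-valued $\mathbf{Z}$, $\tilde{\mathbf{Z}}$; all names are illustrative:

```python
import numpy as np

def accelerate(z_new, z_old, zt_new, zt_old, eps_old):
    """Nesterov-style extrapolation, used when the residual test passes."""
    eps_new = (1.0 + np.sqrt(1.0 + 4.0 * eps_old ** 2)) / 2.0
    w = (eps_old - 1.0) / eps_new  # extrapolation weight
    u_new = z_new + w * (z_new - z_old)
    ut_new = zt_new + w * (zt_new - zt_old)
    return eps_new, u_new, ut_new

def restart(z_new, zt_new, c_prev, eta=0.97):
    """Reset the momentum when the combined residual fails to decrease."""
    return 1.0, z_new.copy(), zt_new.copy(), c_prev / eta
```

With `eps_old = 1` the weight `w` is zero, so the first step after a restart performs no extrapolation, exactly as the reset equations require.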

Up to this point, all subproblems of the proposed model are solved. The OGS-Lp-FAST algorithm just described is summarized as Algorithm 1.

The $\mathit{F}$ sub-problem should be updated as

$$\mathbf{F}^{(k+1)}={\mathcal{F}}^{-1}\left(\frac{{\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*\left({\overline{\mathbf{U}}}_{1}^{(k+1)}-{\overline{\tilde{\mathbf{U}}}}_{1}^{(k+1)}\right)+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*\left({\overline{\mathbf{U}}}_{2}^{(k+1)}-{\overline{\tilde{\mathbf{U}}}}_{2}^{(k+1)}\right)+{\beta}_{3}\left({\overline{\mathbf{Z}}}_{3}^{(k+1)}+\overline{\mathbf{G}}-{\overline{\tilde{\mathbf{Z}}}}_{3}^{(k+1)}\right)+{\beta}_{4}\left({\overline{\mathbf{Z}}}_{4}^{(k+1)}-{\overline{\tilde{\mathbf{Z}}}}_{4}^{(k+1)}\right)}{{\beta}_{1}{\overline{\mathbf{K}}}_{h}^{*}.*{\overline{\mathbf{K}}}_{h}+{\beta}_{2}{\overline{\mathbf{K}}}_{v}^{*}.*{\overline{\mathbf{K}}}_{v}+{\beta}_{3}\mathbf{1}+{\beta}_{4}\mathbf{1}}\right).$$

**Algorithm 1** OGS-Lp-FAST pseudo-code

Input: noisy image $\mathbf{G}$; Output: denoised image $\mathbf{F}$

Initialize: $k=1$, $n=0$, $\mathbf{Z}_{i}^{(k)}=\mathbf{0}$, ${\tilde{\mathbf{Z}}}_{i}^{(k)}=\mathbf{0}$ $(i=1,2,\cdots ,4)$, ${\beta}_{1}$, ${\beta}_{2}$, ${\beta}_{3}$, ${\beta}_{4}$, $\mu$, $p$, $\eta$, $\mathbf{U}_{j}^{(k)}=\mathbf{0}$, ${\tilde{\mathbf{U}}}_{j}^{(k)}=\mathbf{0}$ $(j=1,2)$, $\gamma$, $tol$, ${c}^{(1)}=\mathrm{inf}$, $\mathbf{F}^{(k)}=\mathbf{G}$, $Max$

1: for $k=1:Max$
2: if ${c}_{i}^{(k+1)}<\eta {c}_{i}^{(k)}$
3: update $\mathbf{Z}_{i}^{(k+1)}$ $(i=1,2,3,4)$ with Equations (33) and (34)
4: update ${\tilde{\mathbf{Z}}}_{i}^{(k+1)}$ $(i=1,2,3,4)$ with Equation (35)
5: update ${\epsilon}_{i}^{(k+1)}$, $\mathbf{U}_{j}^{(k+1)}$, ${\tilde{\mathbf{U}}}_{j}^{(k+1)}$ $(j=1,2)$ with Equation (37)
6: update $\mathbf{F}^{(k+1)}$ with Equation (39)
7: $E={\Vert \mathbf{F}^{(k+1)}-\mathbf{F}^{(k)}\Vert}_{2}/{\Vert \mathbf{F}^{(k)}\Vert}_{2}$
8: else
9: restart as in Equation (38)
10: end if
11: if $E<tol$, break
12: end for
13: return $\mathbf{F}^{(k)}$ as $\mathbf{F}$
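The stopping rule in steps 7 and 11 can be wrapped in a generic outer loop; a sketch in which `update` stands for one full round of $\mathbf{Z}$, dual, acceleration, and $\mathbf{F}$ updates (a hypothetical callable):

```python
import numpy as np

def run_outer_loop(G, update, tol=1e-4, max_iter=500):
    """Iterate F <- update(F) until the relative change
    ||F_new - F||_2 / ||F||_2 drops below tol (Algorithm 1, steps 7/11)."""
    F = np.asarray(G, dtype=float).copy()
    for _ in range(max_iter):
        F_new = update(F)
        E = np.linalg.norm(F_new - F) / max(np.linalg.norm(F), 1e-12)
        F = F_new
        if E < tol:
            break
    return F
```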

## 5. Experimental Results and Analyses

In this section, eight typical grayscale images with a size of $256\times 256$ pixels are chosen to validate the denoising performance of the OGS-Lp-FAST method. The test images are shown in Figure 3. The image "House" is downloaded from http://sipi.usc.edu/database/database.php?volume=misc&image=5. The images "Lena" and "Pepper" are from http://decsai.ugr.es/cvg/dbimagenes/. The images "Woman", "Girl", and "Reagan" are from http://www.hlevkin.com/default.html#testimages. The images "Milk drop" and "Shoulder" are from http://www.cs.cmu.edu/~cil/v-images.html. The versions used in this paper were converted from these sources into the required formats with Photoshop.

The method proposed here is compared with the ATV, ITV, TGV, OGS-L1, and OGS-Lp methods, and is evaluated objectively in terms of the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), runtime, and other experimental indicators [45]. Simulations are performed on the MATLAB R2014a platform, running on an Intel(R) Core(TM) CPU with 16 GB of memory.

#### 5.1. Evaluation Method

In the denoising field, the common evaluation criteria include PSNR, SSIM, and runtime. The PSNR and SSIM [46] are defined in Equations (40) and (41):

$$PSNR(\mathbf{X},\mathbf{Y})=10\mathrm{lg}\frac{{(MAX(\mathbf{X}))}^{2}}{\frac{1}{{N}^{2}}{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{j=1}^{N}{(\mathbf{X}_{ij}-\mathbf{Y}_{ij})}^{2}}}},\text{}$$

where $\mathbf{X}$ denotes the original image, $\mathbf{Y}$ is the reconstructed image, and $MAX(\mathbf{X})$ represents the largest gray value in the original image.

$$SSIM(\mathbf{X},\mathbf{Y})=\frac{(2{u}_{\mathbf{X}}{u}_{\mathbf{Y}}+{(L{k}_{1})}^{2})(2{\sigma}_{\mathbf{X}\mathbf{Y}}+{(L{k}_{2})}^{2})}{({u}_{\mathbf{X}}^{2}+{u}_{\mathbf{Y}}^{2}+{(L{k}_{1})}^{2})({\sigma}_{\mathbf{X}}^{2}+{\sigma}_{\mathbf{Y}}^{2}+{(L{k}_{2})}^{2})},\text{}$$

where ${u}_{\mathbf{X}}$ and ${u}_{\mathbf{Y}}$ are the means of $\mathbf{X}$ and $\mathbf{Y}$; ${\sigma}_{\mathbf{X}}^{2}$ and ${\sigma}_{\mathbf{Y}}^{2}$ are their variances; ${\sigma}_{\mathbf{X}\mathbf{Y}}$ is the covariance between $\mathbf{X}$ and $\mathbf{Y}$; and $L=255$, ${k}_{1}=0.05$, ${k}_{2}=0.05$.
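For reference, a NumPy sketch of the PSNR of Equation (40), taking $MAX(\mathbf{X})$ as the largest gray value of the original image as in the text:

```python
import numpy as np

def psnr(X, Y):
    """Peak signal-to-noise ratio (dB), per Equation (40)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mse = np.mean((X - Y) ** 2)  # mean squared error over all pixels
    return 10.0 * np.log10(X.max() ** 2 / mse)
```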

#### 5.2. Sensitivity of the Parameters

In this section, an important parameter of the proposed algorithm, the group size K, is tested to evaluate its overall effect on the algorithm, with PSNR and SSIM as the objective criteria. Three images ("Girl", "House", and "Lena") with a 30% noise level are selected, on which K is varied from 1 to 10 while the other parameters are tuned to their optima. The PSNR and SSIM values are recorded and plotted in Figure 4 and Figure 5. Both PSNR and SSIM increase with K and reach their maxima at K = 5; further increases in K lead to decreased PSNR values. Thus, the neighborhood information of an image has a positive impact on the performance of the algorithm. With K set to an appropriate value, the edge information of the image is better preserved and the noise resistance is improved. However, K should not be too large either, or nearby regions with drastic pixel changes would be included, resulting in decreased PSNR and SSIM.

We then tested how to select a good regularization parameter $\mu$ for different images. We started from a low value of $\mu$ and increased it empirically as the noise level rose, to obtain the best visual effect. For example, for the "Girl" image corrupted by impulse noise at levels of 20% to 50%, $\mu$ = 0.14, 0.15, 0.15, and 0.18, respectively.

For the parameter $p$, the value is set between 0 and 1. With the other parameters fixed, we increase $p$ in steps of 0.1. After several rounds of experiments, we select the optimal $p$ as the value at which the image achieves the best visual effect.

The optimal parameters for different images with the noise level from 20% to 50% are given in Table 1.

#### 5.3. Testing and Comparing the Denoising Performance of Different Algorithms

Six images are selected from the original images of Figure 3 for testing, to which impulse noise at levels from 20% to 50% is added, to compare the denoising effects of six algorithms: ATV, ITV, TGV, OGS-L1, OGS-Lp, and ours. To ensure the objectiveness and fairness of the evaluation, all of the above algorithms adopt the following stopping condition:

$${\Vert {\mathit{F}}^{(k+1)}-{\mathit{F}}^{(k)}\Vert}_{2}\cdot {\Vert {\mathit{F}}^{(k)}\Vert}_{2}^{-1}<1{0}^{-4}.\text{}$$

Regularization parameters of each algorithm are adjusted to ensure its best denoising effect, which ensures the fairness of the test. For methods based on the OGS model, the group size is set to K = 5. The test results on different images are given in Table 2, Table 3, Table 4 and Table 5, with the best indicator values shown in bold. From the data in each table, the following conclusions can be drawn:

- With the introduction of different levels of noise to the images, our model generates higher PSNR and SSIM values for the reconstructed images than other methods, indicating its superior denoising effect. The recovered images also resemble the original ones more.
- The proposed model works better at lower noise levels. For example, at a 20% noise level, as shown in Table 2, the PSNR value of the “House” image (37.72 dB) given by our model is 5.91 dB higher than that given by the ITV model (31.81 dB) and 5.4 dB higher than that of the TGV model (32.32 dB). Even at high noise levels, our model still performs better than the others, which shows the clear advantages that total variation with overlapping group sparsity has over the classic anisotropic TV model.
- Compared to OGS-L1, our proposed method incorporates the Lp-pseudo-norm shrinkage, which adds another degree of freedom to the algorithm and improves the depiction of the gradient-domain sparsity of the images, achieving a better denoising effect. For example, at a 20% noise level, as shown in Table 2, the PSNR value of the “Girl” image (32.34 dB) given by our model is 1.67 dB higher than that given by the OGS-L1 model (30.67 dB). Even at a noise level of 50%, as shown in Table 5, the PSNR value of the “Girl” image (27.35 dB) given by our model is still 0.90 dB higher than that given by the OGS-L1 model (26.45 dB). This proves that the Lp-pseudo-norm is more suitable as a regularizer for describing the sparsity of images than the L1-norm.
- In terms of the runtime of the six models, the OGS-based method is more time consuming than ATV, ITV, and TGV. This is mainly because the OGS model considers the gradient information of the neighborhood in an image undergoing reconstruction, thus making the computation more complex.
- Comparing the values of PSNR and SSIM in Table 2, Table 3, Table 4 and Table 5, OGS-Lp-FAST and OGS-Lp have the same denoising effect. However, from the runtime values over all of the test images, we find that convergence is sped up in the OGS-Lp-FAST method by the use of accelerated ADMM with a restart. For example, at a 20% noise level, as shown in Table 2, the runtime for the "Woman" image (8.69 s) given by the OGS-Lp-FAST model is 7.53 s less than that given by the OGS-Lp model (16.22 s).

To verify our proposed method further, we compared the image details restored by different algorithms. Figure 6 shows four images (“Woman”, “Pepper”, “Girl”, and “House”) with impulse noise from 20% to 50%. The enlarged details of the images restored by ATV, ITV, TGV, OGS-L1, and OGS-Lp-FAST algorithms are displayed for comparison.

In terms of the visual effect of the restored images, ATV denoising produces apparent blocking artifacts. In the "Pepper" image recovered by ATV, two heavy noise pixels can easily be found.

The ITV method also shows a significant staircase effect. In the "House" image it restores, the edges are not well preserved when the noise pollution is high.

For the denoising results of TGV, the blocking artifacts in the images are sufficiently suppressed, but local heavy noise spots are still observable.

Finally, by comparing the four images, we can easily see the visual improvement in the images by using our method. Even at high noise pollution, our method protects the edge information of the image very well, and at the same time, avoids the staircase effect.

## 6. Discussion and Conclusions

In this work, we study a new regularization model that applies TV with OGS and Lp-pseudo-norm shrinkage to images corrupted by impulse noise. We provide an efficient algorithm, OGS-Lp-FAST, under the ADMM framework. The algorithm is rooted in overlapping group sparsity-based regularization and is validated by comparisons with the ATV, ITV, TGV, OGS-L1, and OGS-Lp models. The following conclusions are drawn from the experimental results:

- An overlapping group sparsity (OGS)-based regularizer is used to replace the anisotropic total variation (ATV), to describe the prior conditions of the image. OGS makes full use of the similarity among image neighborhoods and the dissimilarity in the surroundings of each point. It promotes the distinction between the smooth and edge regions of an image, thus enhancing the robustness of the proposed model.
- Lp-pseudo-norm shrinkage is used in place of the L1-norm regularization to describe the fidelity term of images with salt and pepper noise. With the inclusion of another degree of freedom, Lp-pseudo-norm shrinkage reflects the sparsity of the image better and greatly improves the denoising performance of the algorithm.
- The difference operator is used for convolution. Under the ADMM framework, the complex model is transformed into a series of simpler mathematical problems to solve.
- Appropriate K values could effectively improve the overall denoising performance of the model. In practice, this parameter needs to be adjusted. If it is too small, the neighborhood information is not utilized completely. If the value is too big, too many dissimilar pixel blocks will be included, impairing the denoising result.
- The adoption of accelerated ADMM with a restart accelerates the convergence of the algorithm. The running time is reduced.
- In this paper, we focus on impulse noise removal, but the model is also applicable to the removal of other types of noise, which we will study further in future work.

## Author Contributions

Funding acquisition, F.Y.; Methodology, L.W., Y.C. (Yingpin Chen) and F.L.; Software, L.W., Y.C. (Yingpin Chen) and Y.C. (Yuqun Chen) and Z.C.; Writing—original draft, L.W.; Writing—review & editing, Y.C. (Yingpin Chen).

## Funding

This work is supported by the Foundation of Fujian Province Great Teaching Reform [FBJG20180015]; the Education and Scientific Research Foundation of the Education Department of Fujian Province for Middle-aged and Young Teachers [grant numbers JT180309, JAT170352, JT180310, and JT180311]; the Foundation of the Department of Education of Guangdong Province [2017KCXTD015]; and the Open Foundation of the Digital Signal and Image Processing Key Laboratory of Guangdong Province [grant number 2017GDDSIPL-01].

## Conflicts of Interest

The authors declare no conflict of interest.

## References

- He, W.; Zhang, H.; Zhang, L.; Shen, H. Total-Variation-Regularized Low-Rank Matrix Factorization for Hyperspectral Image Restoration. IEEE Trans. Geosci. Remote Sens. **2015**, 54, 178–188.
- Cevher, V.; Sankaranarayanan, A.; Duarte, M.F.; Reddy, D.; Baraniuk, R.G.; Chellappa, R. Compressive Sensing for Background Subtraction; Springer: Berlin/Heidelberg, Germany, 2008; pp. 155–168.
- Huang, G.; Jiang, H.; Matthews, K.; Wilford, P. Lensless imaging by compressive sensing. In Proceedings of the IEEE International Conference on Image Processing, Melbourne, VIC, Australia, 15–18 September 2014; pp. 2101–2105.
- Chen, Y.; Peng, Z.; Cheng, Z.; Tian, L. Seismic signal time-frequency analysis based on multi-directional window using greedy strategy. J. Appl. Geophys. **2017**, 143, 116–128.
- Lu, S.L.T.; Fang, L. Spectral–spatial adaptive sparse representation for hyperspectral image denoising. IEEE Trans. Geosci. Remote Sens. **2016**, 54, 373–385.
- Zhao, W.; Lu, H. Medical Image Fusion and Denoising with Alternating Sequential Filter and Adaptive Fractional Order Total Variation. IEEE Trans. Instrum. Meas. **2017**, 66, 1–12.
- Knoll, F.; Bredies, K.; Pock, T.; Stollberger, R. Second order total generalized variation (TGV) for MRI. Magn. Reson. Med. **2011**, 65, 480–491.
- Kong, D.; Peng, Z. Seismic random noise attenuation using shearlet and total generalized variation. J. Geophys. Eng. **2015**, 12, 1024–1035.
- Zhu, Z.; Yin, H.; Chai, Y.; Li, Y.; Qi, G. A novel multi-modality image fusion method based on image decomposition and sparse representation. Inform. Sci. **2018**, 432, 516–529.
- Marquina, A.; Osher, S.J. Image super-resolution by TV-regularization and Bregman iteration. J. Sci. Comput. **2008**, 37, 367–382.
- Nikolova, M. A variational approach to remove outliers and impulse noise. J. Math. Imaging Vis. **2004**, 20, 99–120.
- Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. **1992**, 60, 259–268.
- Qin, Z. An Alternating Direction Method for Total Variation Denoising. Optim. Methods Softw. **2015**, 30, 594–615.
- Tai, X.C.; Wu, C. Augmented Lagrangian Method, Dual Methods and Split Bregman Iteration for ROF Model. In International Conference on Scale Space and Variational Methods in Computer Vision; Springer: Berlin/Heidelberg, Germany, 2009; pp. 502–513.
- Ng, M.; Wang, F.; Yuan, X.M. Fast Minimization Methods for Solving Constrained Total-Variation Superresolution Image Reconstruction. Multidimens. Syst. Signal Process. **2011**, 22, 259–286.
- Chan, T.; Marquina, A.; Mulet, P. High-order total variation-based image restoration. SIAM J. Sci. Comput. **2000**, 22, 503–516.
- Chan, T.F.; Esedoglu, S.; Park, F. A fourth order dual method for staircase reduction in texture extraction and image restoration problems. In Proceedings of the 2010 17th IEEE International Conference on Image Processing (ICIP), Hong Kong, China, 26–29 September 2010; pp. 4137–4140.
- Wu, L.; Chen, Y.; Jin, J.; Du, H.; Qiu, B. Four-directional fractional-order total variation regularization for image denoising. J. Electron. Imaging **2017**, 26, 053003.
- Bredies, K.; Kunisch, K.; Pock, T. Total generalized variation. SIAM J. Imaging Sci. **2010**, 3, 492–526.
- Feng, W.; Lei, H.; Gao, Y. Speckle reduction via higher order total variation approach. IEEE Trans. Image Process. **2014**, 23, 1831–1843.
- Cheng, Z.; Chen, Y.; Wang, L.; Lin, F.; Wang, H.; Chen, Y. Four-Directional Total Variation Denoising Using Fast Fourier Transform and ADMM. In Proceedings of the IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 27–29 June 2018; pp. 379–383.
- Hajiaboli, M.R. An Anisotropic Fourth-Order Partial Differential Equation for Noise Removal; Springer: Berlin/Heidelberg, Germany, 2009.
- Zhang, J.; Chen, K. A Total Fractional-Order Variation Model for Image Restoration with Non-homogeneous Boundary Conditions and its Numerical Solution. SIAM J. Imaging Sci. **2015**, 8, 2487–2518.
- Chen, P.-Y.; Selesnick, I.W. Group-Sparse Signal Denoising: Non-Convex Regularization, Convex Optimization. arXiv **2013**.
- Liu, J.; Huang, T.-Z.; Selesnick, I.W.; Lv, X.-G.; Chen, P.-Y. Image restoration using total variation with overlapping group sparsity. Inform. Sci. **2015**, 295, 232–246.
- Adam, T.; Paramesran, R. Image denoising using combined higher order non-convex total variation with overlapping group sparsity. Multidimens. Syst. Signal Process. **2018**, 1–25.
- Wu, Y.C.L.; Du, H. Efficient compressed sensing MR image reconstruction using anisotropic overlapping group sparsity total variation. In Proceedings of the 2017 7th International Workshop on Computer Science and Engineering, Beijing, China, 25–27 June 2017.
- Chen, Y.; Wu, L.; Peng, Z.; Liu, X. Fast Overlapping Group Sparsity Total Variation Image Denoising Based on Fast Fourier Transform and Split Bregman Iterations. In Proceedings of the 7th International Workshop on Computer Science and Engineering, Beijing, China, 25–27 June 2017.
- Chartrand, R. Exact Reconstruction of Sparse Signals via Nonconvex Minimization. IEEE Signal Process. Lett. **2007**, 14, 707–710.
- Parekh, A.; Selesnick, I.W. Convex Denoising using Non-Convex Tight Frame Regularization. IEEE Signal Process. Lett. **2015**, 22, 1786–1790.
- Li, S.; He, Y.; Chen, Y.; Liu, W.; Yang, X.; Peng, Z. Fast multi-trace impedance inversion using anisotropic total p-variation regularization in the frequency domain. J. Geophys. Eng. **2018**, 15.
- Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. **2011**, 3, 1–122.
- Liu, J.; Huang, T.-Z.; Liu, G.; Wang, S.; Lv, X.-G. Total variation with overlapping group sparsity for speckle noise reduction. Neurocomputing **2016**, 216, 502–513.
- Goldstein, T.; O'Donoghue, B.; Setzer, S.; Baraniuk, R. Fast alternating direction optimization methods. SIAM J. Imaging Sci. **2014**, 7, 1588–1623.
- Chan, R.H.; Ho, C.-W.; Nikolova, M. Salt-and-pepper noise removal by median-type noise detectors and detail-preserving regularization. IEEE Trans. Image Process. **2005**, 14, 1479–1485.
- Zhong, Q.; Wu, C.; Shu, Q.; Liu, R.W. Spatially adaptive total generalized variation-regularized image deblurring with impulse noise. J. Electron. Imaging **2018**, 27, 053006.
- Trivedi, M.C.; Singh, V.K.; Kolhe, M.L.; Goyal, P.K.; Shrimali, M. Patch-Based Image Denoising Model for Mixed Gaussian Impulse Noise Using L1 Norm. In Intelligent Communication and Computational Technologies; Springer: Singapore, 2018; pp. 77–84.
- Zheng, L.; Maleki, A.; Weng, H.; Wang, X.; Long, T. Does ℓp-minimization outperform ℓ1-minimization? IEEE Trans. Inform. Theory **2017**, 63, 6896–6935.
- Xie, Y.; Gu, S.; Liu, Y.; Zuo, W.; Zhang, W.; Zhang, L. Weighted Schatten p-Norm Minimization for Image Denoising and Background Subtraction. IEEE Trans. Image Process. **2016**, 25, 4842–4857.
- Zhou, X.; Molina, R.; Zhou, F.; Katsaggelos, A.K. Fast iteratively reweighted least squares for lp regularized image deconvolution and reconstruction. IEEE Int. Conf. Image Process. **2015**, 24, 1783–1787.
- Chen, F.; Zhang, Y. Sparse Hyperspectral Unmixing Based on Constrained lp–l2 Optimization. IEEE Geosci. Remote Sens. Lett. **2013**, 10, 1142–1146.
- Chen, Y.; Peng, Z.; Gholami, A.; Yan, J.; Li, S. Seismic Signal Sparse Time-Frequency Analysis by Lp-Quasinorm Constraint. arXiv
**2018**. [Google Scholar] - Woodworth, J.; Chartrand, R. Compressed Sensing Recovery via Nonconvex Shrinkage Penalties. Inverse Probl.
**2016**, 32, 075004. [Google Scholar] [CrossRef] - Sidky, E.Y.; Chartrand, R.; Boone, J.M.; Pan, X. Constrained TpV Minimization for Enhanced Exploitation of Gradient Sparsity: Application to CT Image Reconstruction. IEEE J. Transl. Eng. Health Med.
**2014**, 2, 1–18. [Google Scholar] [CrossRef] [PubMed] - Wu, C.; Tai, X.C. Augmented Lagrangian Method, Dual Methods, and Split Bregman Iteration for ROF, Vectorial TV, and High Order Models. Siam J. Imaging Sci.
**2012**, 3, 300–339. [Google Scholar] [CrossRef] - Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process.
**2004**, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]

**Figure 2.** Feasible regions of ${\mathrm{R}}_{\mathrm{ApTV}}\left(\mathit{F}\right)$: (**a**) p = 1; (**b**) 0 < p < 1.
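The non-convex feasible region for 0 < p < 1 in Figure 2 is what the Lp-pseudo-norm shrinkage exploits. As an illustration only (not the paper's exact sub-problem solver), the generalized p-shrinkage operator introduced by Chartrand can be sketched as follows; the threshold `tau` and exponent `p` are illustrative values, and at p = 1 the operator reduces to ordinary soft-thresholding:

```python
import numpy as np

def p_shrink(x, tau, p):
    """Generalized p-shrinkage (Chartrand):
    sign(x) * max(|x| - tau^(2-p) * |x|^(p-1), 0).
    For p = 1 this is the familiar soft-thresholding operator."""
    x = np.asarray(x, dtype=np.float64)
    mag = np.abs(x)
    # Guard against 0**(p-1) for p < 1; exact zeros shrink to zero anyway.
    safe = np.where(mag > 0, mag, 1.0)
    shrunk = np.maximum(mag - tau ** (2 - p) * safe ** (p - 1), 0.0)
    return np.sign(x) * np.where(mag > 0, shrunk, 0.0)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(p_shrink(x, 1.0, 1.0))  # soft-thresholding case
print(p_shrink(x, 1.0, 0.5))  # non-convex case: large entries shrink less
```

Because `|x|^(p-1)` decays for large `|x|` when p < 1, large-magnitude coefficients are penalized less than under the L1-norm, which is the intuition behind the improved edge preservation reported later.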

**Figure 3.** Original images, from left to right: (**a**) Girl.bmp; (**b**) Milk drop.tiff; (**c**) House.png; (**d**) Reagan.bmp; (**e**) Lena.png; (**f**) Shoulder.jpg; (**g**) Woman.tif; (**h**) Pepper.tiff.

**Figure 4.** Comparison of peak signal-to-noise ratio (PSNR) under different K values. The test images ("Girl", "House", "Lena") are corrupted by 30% impulse noise.

**Figure 5.** Comparison of structural similarity (SSIM) under different K values. The test images ("Girl", "House", "Lena") are corrupted by 30% impulse noise.

**Figure 6.** Comparison of image details recovered by our proposed model and the other models. The test images ("Woman", "Pepper", "Girl", and "House") are corrupted by 20%, 30%, 40%, and 50% impulse noise, respectively.

**Table 1.** Selected parameter values for each test image under different noise levels; each entry is $\mathit{\mu}/\mathit{p}$.

| Level | House | Lena | Woman | Milk drop | Girl | Shoulder |
|---|---|---|---|---|---|---|
| 20% | 0.14/0.45 | 0.15/0.45 | 0.15/0.65 | 0.14/0.65 | 0.14/0.65 | 0.14/0.65 |
| 30% | 0.15/0.45 | 0.15/0.65 | 0.15/0.7 | 0.16/0.65 | 0.15/0.65 | 0.15/0.65 |
| 40% | 0.15/0.45 | 0.16/0.65 | 0.15/0.7 | 0.19/0.65 | 0.15/0.65 | 0.15/0.65 |
| 50% | 0.18/0.55 | 0.17/0.65 | 0.2/0.7 | 0.2/0.65 | 0.18/0.55 | 0.18/0.55 |

**Table 2.** Numerical comparison of our proposed method and other models (images are corrupted by 20% impulse noise). ATV: anisotropic total variation; ITV: isotropic total variation; TGV: total generalized variation; OGS-L1: overlapping group sparsity with L1-norm; OGS-Lp: overlapping group sparsity with Lp-pseudo-norm.

| Image | Method | PSNR (dB) | SSIM | Time (s) |
|---|---|---|---|---|
| Lena | ATV | 28.71 | 0.8854 | 4.81 |
| | ITV | 28.83 | 0.8936 | 2.45 |
| | TGV | 28.63 | 0.8966 | 9.59 |
| | OGS-L1 | 29.79 | 0.9115 | 16.34 |
| | OGS-Lp | 31.47 | 0.9474 | 13.94 |
| | OGS-Lp-FAST | 31.55 | 0.9482 | 8.64 |
| House | ATV | 32.31 | 0.8871 | 3.09 |
| | ITV | 31.81 | 0.8888 | 1.84 |
| | TGV | 32.32 | 0.9135 | 8.38 |
| | OGS-L1 | 33.04 | 0.9127 | 15.27 |
| | OGS-Lp | 37.47 | 0.9667 | 12.45 |
| | OGS-Lp-FAST | 37.72 | 0.9679 | 10.48 |
| Shoulder | ATV | 35.32 | 0.9636 | 5.42 |
| | ITV | 35.33 | 0.9649 | 3.56 |
| | TGV | 35.29 | 0.9256 | 14.98 |
| | OGS-L1 | 37.00 | 0.9719 | 17.77 |
| | OGS-Lp | 38.89 | 0.9829 | 16.06 |
| | OGS-Lp-FAST | 38.92 | 0.9830 | 14.61 |
| Girl | ATV | 29.45 | 0.8868 | 4.53 |
| | ITV | 30.05 | 0.8940 | 3.22 |
| | TGV | 30.14 | 0.8907 | 9.58 |
| | OGS-L1 | 30.67 | 0.8995 | 13.48 |
| | OGS-Lp | 32.29 | 0.9365 | 14.17 |
| | OGS-Lp-FAST | 32.34 | 0.9371 | 11.69 |
| Milk Drop | ATV | 32.32 | 0.8973 | 4.83 |
| | ITV | 31.02 | 0.9039 | 3.63 |
| | TGV | 30.48 | 0.8940 | 8.59 |
| | OGS-L1 | 33.35 | 0.9110 | 16.39 |
| | OGS-Lp | 35.76 | 0.9533 | 13.42 |
| | OGS-Lp-FAST | 35.87 | 0.9538 | 8.58 |
| Woman | ATV | 29.45 | 0.8868 | 4.53 |
| | ITV | 29.65 | 0.9015 | 3.73 |
| | TGV | 29.84 | 0.8853 | 8.98 |
| | OGS-L1 | 30.35 | 0.9080 | 16.03 |
| | OGS-Lp | 31.71 | 0.9395 | 16.22 |
| | OGS-Lp-FAST | 31.70 | 0.9398 | 8.69 |

**Table 3.** Numerical comparison of our proposed method and other models (images are corrupted by 30% impulse noise).

| Image | Method | PSNR (dB) | SSIM | Time (s) |
|---|---|---|---|---|
| Lena | ATV | 27.08 | 0.8290 | 5.00 |
| | ITV | 27.06 | 0.8370 | 2.88 |
| | TGV | 27.23 | 0.8319 | 8.16 |
| | OGS-L1 | 27.59 | 0.8543 | 9.56 |
| | OGS-Lp | 29.19 | 0.9035 | 15.72 |
| | OGS-Lp-FAST | 29.21 | 0.9039 | 7.48 |
| House | ATV | 30.40 | 0.8717 | 3.80 |
| | ITV | 30.16 | 0.8545 | 2.17 |
| | TGV | 30.82 | 0.8862 | 11.61 |
| | OGS-L1 | 31.50 | 0.8807 | 7.77 |
| | OGS-Lp | 35.03 | 0.9432 | 13.80 |
| | OGS-Lp-FAST | 35.13 | 0.9444 | 10.77 |
| Shoulder | ATV | 34.48 | 0.9551 | 4.95 |
| | ITV | 34.33 | 0.9556 | 3.94 |
| | TGV | 34.74 | 0.9493 | 22.00 |
| | OGS-L1 | 35.34 | 0.9599 | 16.97 |
| | OGS-Lp | 36.50 | 0.9633 | 18.88 |
| | OGS-Lp-FAST | 36.47 | 0.9628 | 5.88 |
| Girl | ATV | 28.17 | 0.8586 | 5.17 |
| | ITV | 28.53 | 0.8737 | 3.31 |
| | TGV | 28.82 | 0.8422 | 8.84 |
| | OGS-L1 | 29.11 | 0.8818 | 8.88 |
| | OGS-Lp | 30.42 | 0.9155 | 15.64 |
| | OGS-Lp-FAST | 30.41 | 0.9148 | 12.84 |
| Milk Drop | ATV | 30.24 | 0.8788 | 4.95 |
| | ITV | 30.10 | 0.8681 | 2.33 |
| | TGV | 29.42 | 0.8878 | 11.45 |
| | OGS-L1 | 31.08 | 0.8836 | 9.33 |
| | OGS-Lp | 32.70 | 0.9261 | 16.88 |
| | OGS-Lp-FAST | 33.19 | 0.9274 | 10.64 |
| Woman | ATV | 27.86 | 0.8534 | 4.27 |
| | ITV | 28.43 | 0.8660 | 2.83 |
| | TGV | 28.32 | 0.8440 | 9.53 |
| | OGS-L1 | 29.05 | 0.8725 | 11.59 |
| | OGS-Lp | 30.15 | 0.9063 | 17.25 |
| | OGS-Lp-FAST | 30.13 | 0.9047 | 10.89 |

**Table 4.** Numerical comparison of our proposed method and other models (images are corrupted by 40% impulse noise).

| Image | Method | PSNR (dB) | SSIM | Time (s) |
|---|---|---|---|---|
| Lena | ATV | 25.85 | 0.7960 | 5.00 |
| | ITV | 25.80 | 0.8009 | 3.11 |
| | TGV | 26.20 | 0.8041 | 10.27 |
| | OGS-L1 | 26.22 | 0.8151 | 11.25 |
| | OGS-Lp | 27.64 | 0.8683 | 18.88 |
| | OGS-Lp-FAST | 27.67 | 0.8675 | 9.81 |
| House | ATV | 28.50 | 0.8433 | 4.94 |
| | ITV | 28.69 | 0.8356 | 2.31 |
| | TGV | 29.21 | 0.8284 | 9.48 |
| | OGS-L1 | 29.31 | 0.8566 | 11.95 |
| | OGS-Lp | 32.84 | 0.9182 | 13.77 |
| | OGS-Lp-FAST | 32.92 | 0.9189 | 12.80 |
| Shoulder | ATV | 32.40 | 0.9328 | 4.86 |
| | ITV | 32.54 | 0.9357 | 3.86 |
| | TGV | 32.80 | 0.9345 | 24.81 |
| | OGS-L1 | 32.62 | 0.9310 | 16.36 |
| | OGS-Lp | 33.25 | 0.9510 | 20.66 |
| | OGS-Lp-FAST | 33.24 | 0.9509 | 17.61 |
| Girl | ATV | 27.19 | 0.8348 | 4.94 |
| | ITV | 27.33 | 0.8447 | 3.09 |
| | TGV | 27.86 | 0.8135 | 12.11 |
| | OGS-L1 | 27.88 | 0.8526 | 11.78 |
| | OGS-Lp | 28.87 | 0.8861 | 18.05 |
| | OGS-Lp-FAST | 28.86 | 0.8847 | 12.69 |
| Milk Drop | ATV | 27.51 | 0.8307 | 3.97 |
| | ITV | 28.10 | 0.8424 | 3.19 |
| | TGV | 28.09 | 0.8521 | 12.97 |
| | OGS-L1 | 29.34 | 0.8569 | 10.27 |
| | OGS-Lp | 30.56 | 0.8938 | 16.94 |
| | OGS-Lp-FAST | 30.46 | 0.8933 | 13.00 |
| Woman | ATV | 26.97 | 0.8338 | 4.67 |
| | ITV | 27.13 | 0.8366 | 3.63 |
| | TGV | 27.34 | 0.7708 | 9.58 |
| | OGS-L1 | 27.70 | 0.8483 | 13.13 |
| | OGS-Lp | 28.27 | 0.8722 | 18.45 |
| | OGS-Lp-FAST | 28.29 | 0.8716 | 13.11 |

**Table 5.** Numerical comparison of our proposed method and other models (images are corrupted by 50% impulse noise).

| Image | Method | PSNR (dB) | SSIM | Time (s) |
|---|---|---|---|---|
| Lena | ATV | 23.44 | 0.7353 | 6.00 |
| | ITV | 23.56 | 0.7457 | 4.14 |
| | TGV | 25.05 | 0.7631 | 13.44 |
| | OGS-L1 | 25.08 | 0.7612 | 15.86 |
| | OGS-Lp | 25.74 | 0.8264 | 19.06 |
| | OGS-Lp-FAST | 25.72 | 0.8262 | 11.67 |
| House | ATV | 26.48 | 0.8046 | 3.61 |
| | ITV | 27.09 | 0.8139 | 2.88 |
| | TGV | 27.88 | 0.8264 | 15.14 |
| | OGS-L1 | 27.85 | 0.8245 | 10.05 |
| | OGS-Lp | 31.24 | 0.8932 | 16.41 |
| | OGS-Lp-FAST | 31.15 | 0.8923 | 14.61 |
| Shoulder | ATV | 30.99 | 0.9137 | 4.91 |
| | ITV | 31.07 | 0.9168 | 3.59 |
| | TGV | 31.93 | 0.8954 | 24.31 |
| | OGS-L1 | 30.99 | 0.9137 | 16.50 |
| | OGS-Lp | 31.94 | 0.9161 | 19.45 |
| | OGS-Lp-FAST | 32.05 | 0.9277 | 17.72 |
| Girl | ATV | 25.10 | 0.7761 | 5.58 |
| | ITV | 25.73 | 0.7977 | 3.30 |
| | TGV | 26.75 | 0.8140 | 15.67 |
| | OGS-L1 | 26.45 | 0.8110 | 14.16 |
| | OGS-Lp | 27.32 | 0.8460 | 19.28 |
| | OGS-Lp-FAST | 27.35 | 0.8463 | 15.91 |
| Milk Drop | ATV | 26.19 | 0.8082 | 4.92 |
| | ITV | 26.51 | 0.8188 | 3.42 |
| | TGV | 27.01 | 0.8349 | 19.05 |
| | OGS-L1 | 27.61 | 0.8351 | 14.67 |
| | OGS-Lp | 28.65 | 0.8680 | 17.80 |
| | OGS-Lp-FAST | 28.59 | 0.8682 | 14.11 |
| Woman | ATV | 25.84 | 0.8049 | 5.11 |
| | ITV | 25.24 | 0.7939 | 3.67 |
| | TGV | 25.99 | 0.6850 | 10.84 |
| | OGS-L1 | 26.29 | 0.8143 | 13.95 |
| | OGS-Lp | 26.56 | 0.8165 | 18.19 |
| | OGS-Lp-FAST | 26.61 | 0.8169 | 14.02 |
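For completeness, the PSNR values in Tables 2–5 follow the standard definition $\mathrm{PSNR} = 10\log_{10}(255^2/\mathrm{MSE})$, and SSIM follows Wang et al. (2004). The sketch below illustrates both metrics; note that the SSIM here uses global image statistics for brevity, whereas the reference implementation applies a sliding Gaussian window:

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Simplified single-window SSIM using global statistics.
    Wang et al. (2004) compute this locally with a sliding Gaussian
    window and average the map; this variant shows only the formula."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# Example: a synthetic clean image versus a mildly perturbed copy.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(clean + rng.normal(0, 5, clean.shape), 0, 255)
print(round(psnr(clean, noisy), 2), round(ssim_global(clean, noisy), 4))
```

Higher PSNR and SSIM values in the tables therefore indicate restorations closer to the original, while Time (s) measures the runtime of each method.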

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).