Directional (cid:96) 0 Sparse Modeling for Image Stripe Noise Removal

: Remote sensing images are often polluted by stripe noise, which leads to negative impact on visual performance. Thus, it is necessary to remove stripe noise for the subsequent applications, e.g., classiﬁcation and target recognition. This paper commits to remove the stripe noise to enhance the visual quality of images, while preserving image details of stripe-free regions. Instead of solving the underlying image by variety of algorithms, we ﬁrst estimate the stripe noise from the degraded images, then compute the ﬁnal destriping image by the difference of the known stripe image and the estimated stripe noise. In this paper, we propose a non-convex (cid:96) 0 sparse model for remote sensing image destriping by taking full consideration of the intrinsically directional and structural priors of stripe noise, and the locally continuous property of the underlying image as well. Moreover, the proposed non-convex model is solved by a proximal alternating direction method of multipliers (PADMM) based algorithm. In addition, we also give the corresponding theoretical analysis of the proposed algorithm. Extensive experimental results on simulated and real data demonstrate that the proposed method outperforms recent competitive destriping methods, both visually and quantitatively.


Introduction
Stripe noise (all denoted as "stripes" in this paper), which is generally caused by the inconsistency of the detecting element scanning or the influence of the detector moving and temperature changes, etc., is a universal phenomenon in remote sensing images.They may result in a bad influence not only on visual quality but also on subsequent applications in remote sensing images.Therefore, it is necessary to remove stripes and simultaneously maintain the healthy pixels from the degraded images.In general, the stripes have strongly directional and structural information, e.g., pixels normally damaged on row by row or column by column.
Recently, many approaches for destriping problems have been proposed, which may be roughly divided into three categories, mainly including filtering-based methods, statistics-based methods and optimization-based methods.Note that the proposed method belongs to the category of optimization-based methods.
The filtering-based methods, which generally obtain the results with variety of filters, have been widely utilized for remote sensing image destriping [1][2][3][4].In Ref. [1], Chen et al. proposed an approach for remote sensing image destriping tasks based on a finite-impulse-response filter (FIR) in frequency domain.Although this method exhibits good results on the experimental CMODIS data, it inevitably leads to ringing and ripple artifacts.In Ref. [3], the wavelet analysis and adaptive Fourier zero-frequency amplitude normalization was used for hyperspectral image destriping problems, and this wavelet-based method had shown the promising ability for both stripes and random noise.
The statistics-based methods are mainly to analyze the distribution of stripes.These approaches have strong directional characteristic which is considered as the stripe prior for the remote sensing image destriping [5][6][7][8][9][10][11].In Ref. [7], Weinreb et al. introduced a method based on matching empirical distribution functions (EDFs) for GOES-7 data, while the limitations and unstable property were caused by assuming the similarity and regularity among the stripes.To conquer the instability when the stripes are irregular or nonlinear, Rakwatin et al. [9] introduced a method, using both histogram-matching algorithm and local least squares fitting, to remove the stripes of Aqua MODIS band 6.In Ref. [10], spectral moment matching (SpcMM) method, which can automatically remove variety of frequencies stripes in a specific band, was proposed for Hyperion image destriping.In addition, Shen et al. [11] employed a piece-wise destriping method, which uses correction coefficients of each portion by considering neighboring normal row, for nonlinear and irregular stripes, but it cannot automatically select a threshold value to divide the image into different parts.
Recently, the optimization-based methods have shown superiorities for remote sensing image destriping problems [12][13][14][15][16][17][18][19][20][21][22][23].The image destriping generally results in an ill-posed problem which fails to obtain a meaningful, stable and unique solution by the direct and traditional methods.Therefore, a common strategy for ill-posed problems is to construct a regularization model via investigating the priors of underlying image.For the optimization-based methods, they focus on searching and discovering the intrinsically prior knowledge to generate reasonable regularization models.In Ref. [24], Zorzi et al. proposed a sparse and low-rank decomposition for models specified in the time domain, and the authors in [25], proposed a sparse and low-rank decomposition of the inverse of the manifest spectral density to get a simpler graphical model.In Ref. [17], the authors presented a unidirectional total variational (UTV) model for MODIS (i.e., moderate-resolution imaging spectroradiometer) image stripes removal by fully considering the directional information of stripes.The UTV model is motivated by the classical TV model and the analysis of directional stripes.Chang et al. [21] proposed an optimization model combining the UTV with sparse priors of stripes applying to denoising and destriping simultaneously.In Ref. [22], the authors utilized the split Bregman iteration method with an anisotropic spectral-spatial total variation regularization to remove multispectral image stripes.
In summary, although these optimization-based methods can yield effective results of removing stripes, there still exists much room to improve.Most of them are implemented only from the perspective of noise removal, but without considering the typical properties of stripes, e.g., directional and structural properties.Even though they consider these properties, the formulated sparse destriping models fail to accurately depict the typical properties of stripes [26,27].Moreover, the designed algorithms for non-convex models, e.g., 0 sparse model, generally do not have the convergence analysis.These motivate us to develop a new model and design the corresponding effective algorithm, that theoretically guarantees the convergence, to solve the remote sensing destriping problems.
In this paper, to remove the stripes of remote sensing images, we propose a non-convex sparse model which mainly consists of three sparse priors, including an 0 sparse prior by fully considering the directional property of stripes (y-axis), an 1 sparse prior by considering the discontinuity of underlying image (x-axis), and the sparsity of stripes by considering the structural property of stripes.The framework of the proposed method is shown in Figure 1.Moreover, we design a proximal alternating direction method of multipliers (PADMM) based algorithm to solve the proposed non-convex sparse model.Actually, PADMM method mainly incorporates a proximal term into the well-known ADMM method which has been applied to many image processing applications, e.g., image deblurring [28], image denoising [29], tensor completion [30], etc.Note that, after adding this additional proximal term, the resulting algorithm is actually not an ADMM algorithm.The Karush-Kuhn-Tucker (KKT) conditions is the first-order necessary for a solution in nonlinear programming to be optimal.In particular, the convergence to the KKT point of the optimization problem is theoretically proven in the work.The outline of this paper is organized as follows.In Section 2, we will briefly introduce the related work.The proposed model and detailed solving algorithm will be shown in Section 3. In Section 4, we compare the proposed method with some competitive remote sensing image destriping methods, and give the discussions about the proposed method.Finally, conclusions are drawn in Section 5.

Destriping Problem Formulation
The striping effects in remote sensing images mainly make up of additive and multiplicative components [15].However, the multiplicative stripes can be described as additive case by the logarithm [31].Thus, many types of research focus more on the additive stripes model b(x, y) = u(x, y) + s(x, y) where b(x, y), u(x, y) and s(x, y) denote the components of the observed image, the underlying image and stripes at the location (x, y), respectively.For convenience, a matrix-vector form can be written as follows where b, u and s ∈ R n represent the lexicographical order vectors of b(x, y), u(x, y) and s(x, y), respectively.The purpose of our work is to estimate the stripes s, then the underlying image will be recovered by the formula of u = b − s.

UTV for Remote Sensing Image Destriping
The total variation (TV) model, which is first proposed by Rudin, Osher, and Fatemi (ROF) [32], has shown powerful ability in many image applications, e.g., image unmixing [33], image deblurring [34], image inpainting [35], etc.It has the following form where λ is a positive regularization parameter, and TV(u) represents the regularization expressed as In many approaches, s(x, y) is usually regarded as constant in a given line.Although this assumption has shown stability in MOS-B, it fails in MODIS.Not only predominant nonlinear effects, but also the data quality of random stripes have been obtained in many emissive bands.Thus, more realistic assumptions are introduced to design an efficient destriping method.
Without loss the generality, we can assume that the stripes are along the vertical direction (y-axis).In mathematical words, most pixels of the stripe noise s hold the following property [17]: where we denote y-axis is along stripes direction, and x-axis is across stripes direction.When (x, y) is not the location of a pixel belonging to the stripes, both sides of Equation ( 5) are almost the same, denoted as ∂s(x, y) ∂y By the relation in Equations ( 5) and ( 6), we get Ω ∂s(x, y) ∂y dxdy which means TV y (s) where TV x and TV y are horizontal and vertical variations, respectively.The authors in [17] encourage the robustness of stripes removal by minimizing the unidirectional total variation (UTV) model as follows which can be solved by Euler-Lagrange equation based algorithm.In Ref. [17], the UTV model can effectively deal with remote sensing image destriping problems, which has been demonstrated holding promising ability on Aqua and Terra MODIS data.Although TV model preserves image edges well, it can not accurately depict the specific directional property of stripes, and leads to undesired results.The UTV model that involves unidirectional constraint can remove stripes in the meanwhile not destroy the underlying image details.Inspired by the UTV model, we fully consider the intrinsically directional and structural priors of stripes and the continuous property of the underlying image.Finally, we form a unidirectional and sparsity-based optimization model.

The Proposed Method
Combining the stripes model in Equation (2), we will give the proposed optimal model with unidirectional prior motivated by the extension of the UTV model.In what follows, the detailed explanations of the proposed model and the corresponding solving algorithm will be exhibited.Section 3.1 is the proposed model, Section 3.2 is the existing work, and Section 3.3 is the designed algorithm based on the existing work Section 3.2.

Local Smoothness Along Stripe Direction
The stripes of remote sensing images generally appear with column-by-column (y-axis) or row-by-row (x-axis).Without loss of generality, we view all stripes as column-by-column case to formulate the finally directional model (the row-by-row stripes can be easily rotated to column-by-column stripes to fit in the proposed model).Considering the smoothness of the stripes, the difference between adjacent pixels is quite small, or even close to zero, thus we generally use sparse prior to this character along the stripe direction (y-axis).The first regularization for the difference within the stripes is given as follows R 1 = ||∇ y s|| 0 , (10) where ∇ y is a partial differential operator along stripe direction.We denote that ∇ y u represents the vector form of ∇ y U where U is a 2D image and the definition of ∇ x u is similar to ∇ y u.Comparing with some popular sparse measures, e.g., 1 -norm and p -norm (0 < p < 1), the 0 -norm that stands for the number of non-zero elements of a vector is a promising measure to depict sparse property, thus here we employ 0 -norm to describe the sparsity of ∇ y s.Although this term will lead to the non-convexity of the proposed model, we utilize the designed PADMM based algorithm to guarantee the solution converging to the KKT point.

Local Continuity of the Underlying Image
In general, the underlying image u along the x-axis is viewed as being continuous.When adding column-by-column stripes s to the underlying image, the local continuity of u is broken, which means that we should force ∇ x u to be small to keep the continuity of u.By this assumption and the relation u = b − s, we utilize the following 1 -norm regularization to describe the local continuity of the underlying image where ∇ x represents the difference operator in the across-stripe direction.Note that this term is actually the second term of the UTV model in Equation (9).

Global Sparsity of Stripes
In many destriping approaches [26,27,36,37], the stripes can be naturally viewed as being sparse when the stripes are not dense.Here, we take the 1 -norm to depict the sparsity of stripes: Even though the stripes are dense, this sparse term in Equation ( 12) must be retained, since it can effectively avoid the undesired effect and keep the robustness of the proposed method (see more discussion in the Results Section).
Combining the above three regularization terms, we finally formulate the 0 sparse model for remote sensing image destriping, where µ and λ are two positive parameters.
Note that the proposed model in Equation ( 13) is similar to the model in [26], since they both employ the directional property of stripes.However, there still exists an important difference that the model in [26] enforces 1 norm to ∇ y s and 0 norm to s whereas our model enforces 1 norm to s and 0 norm to ∇ y s.From the experiment results, our optimal model can get better performance based on PADMM algorithm.For instance, Figure 2 shows the number of non-zero elements of s (Figure 2a) and ∇ y s (Figure 2b), where s is estimated from a real image example by the method [26], it is clear that ∇ y s is almost all around 0, whereas s is not.The 0 norm is a promising way to depict sparsity, thus our model enforces 0 norm to ∇ y s.
In what follows, we will exhibit how to solve the proposed non-convex sparse model by introducing the PADMM based algorithm, as well as give the theoretical analysis of the convergence., where s is estimated from a real image example (see Figure 3) by the method [26].The y-axis represents the number of nonzero elements of s and ∇ y s, respectively, and the x-axis stands for the image pixel value whose range is [0, 1].

The Solution
For the non-convex 0 regularization term, there exist many approaches to approximate it, e.g., 1 -norm [38], the logarithm function [39] or the penalty decomposition algorithm (PDA) [40].In this work, we first present a work in Lemma 1, i.e., the mathematical program with equilibrium constraints (MPEC) [36], to transfer the non-convex 0 regularization term to the other equivalent one.Then, we can design a PADMM based algorithm to efficiently solve the equivalent model, in the meanwhile theoretically guarantee the convergence.
Lemma 1 (MPEC equation [36]).For any given w ∈ R n , it holds that where represents the inner product of two vectors, and denotes the elementwise product.The absolute of w stands for the absolute value of each element of w.Then, v * = 1 − sign(|w|) is the unique optimal solution of the minimization problem in Equation ( 14), and sign( .
From Lemma 1, the 0 -norm sparse optimization model in Equation ( 13) is equivalent to According to the analysis of [36], if s * is the globally optimal solution of Equation ( 13), then (s * , 1 − sign(|∇ y s * |)) is the unique global minimizer of Equation ( 15).Note that Equation ( 15) is still a non-convex problem, and the non-convexity is only caused by the constraint v |∇ y s| = 0.However, the problem in Equation ( 15) is similar to the main problem in [36], which is efficiently solved by a PADMM based algorithm that theoretically guarantees the convergence.Therefore, we employ the designed PADMM based algorithm to solve the resulted problem in Equation ( 15), as well as give the theoretical analysis of the convergence.
In the following, we will use the PADMM based algorithm to solve the optimization problem in Equation(15).

PADMM Based Algorithm
Considering the non-smooth 1 terms in the problem in Equation ( 15), we take the following variable substitutions to get the new optimization problem, with the auxiliary variables h, z, w ∈ R n .The augmented Lagrangian function L of Equation ( 16) is as follows where π 1 , π 2 , π 3 , and π 4 are Lagrange multipliers, and β 1 , β 2 , β 3 , and β 4 are positive parameters.
The minimization problem in Equation ( 17) can be solved by the PADMM based algorithm.Next, we discuss the solution of each subproblem.
(1) The h-subproblem can be written to the minimized problem as follows Now, let h i is the i-th pixel of h and we discuss two situations when the element h i = 0, if h i > 0, In summary, the h-subproblem has the closed-form solution as follows where (2) The z-subproblem is given as follows min which has the closed-form solution by soft-thresholding strategy [41] where Shrink(a, T) = sign(a) * max(|a − T|, 0).
(3) Similar to z-subproblem, w-subproblem is written as follows The problem in Equation ( 24) has the following closed-form solution by the soft-shrinkage formulation, where The v-subproblem can be written as follows where (5) Here, PADMM based algorithm needs to introduce an extra convex proximal term where Then, Equation (28) will be equivalent to: where (6) Finally, we update the Lagrangian multipliers by Combining Steps ( 1)-( 6), we formulate the final algorithm to iteratively solve the proposed 0 sparse model in Equation (13).In particular, the subproblems all have the closed-form solutions to ensure the accuracy of the algorithm.Finally, the solving process has been summarized in Algorithm 1.
In Algorithm 1, λ, µ, β 1 , β 2 , β 3 , and β 4 are some pre-defined parameters, while tol and M iter represent the positive tolerance value and the maximum iterations, respectively.In this work, we set tol = 1/255 and M iter = 10 3 .In the following, we discuss the convergence of Algorithm 1.

Experiment Results
In this section, we compare the proposed method with several competitive destriping methods, including the wavelet Fourier adaptive filter (WFAF) [3], the statistical linear destriping (SLD) [31], the unidirectional total variation model (UTV) [17], the global sparsity and local variational (GSLV) [26], and the Low-Rank Single-Image Decomposition (LRSID) [27], on both simulated and real remote sensing data.The codes of these methods, except the GSLV method, are available on the website (http://www.escience.cn/people/changyi/codes.html).Moreovre, we provide the code of the proposed method to compare the results on the website (http://www.escience.cn/people/dengliangjian/codes.html).As suggested in [27], we utilize the same periodic/nonperiodic stripes function adding stripes intensity [0, 255] to the underlying images.By the similar measure as in [27], the degraded images were normalized to [0, 1].All experiments are conducted in MATLAB (R2016a) on a desktop with 16Gb RAM and Inter(R) Core(TM) CPU i5-4590: @3.30GHz.
To evaluate the effects of different destriping methods, we will compare several qualitative and quantitative assessments.On the qualitative aspect, we show the visual results, the mean cross-track profile and the power spectrum of different methods.We also employ some acknowledged indexes, i.e., peak signal-to-noise ratio (PSNR) [42], structural similarity index (SSIM) [42], the inverse coefficient of variation (ICV) [15], mean relative deviation (MRD) [15], and the relative error (ReErr), to evaluate the performance of different approaches.These indexes are as follows, where û and u are the estimated underlying image and the original underlying image.n represents the total number of pixels of one image.
where µ x and µ y represent the average value of x and y images.σ x and σ y stand for the variance of x and y images, and σ xy is the covariance of these two images.C 1 and C 2 are constant here.
where the s added and s estimated represent the added stripes and estimated stripes by different methods, respectively.
where R a is the average value of the pixels with a window in a homogenous image region, and R b is the standard deviation of the pixels of the patch in this window.ICV calculates the homogeneous striped regions in estimated underlying image.
where ûi is the estimated underlying image, and u i is the original underlying image.MRD index shows the performance to retain the noise-free regions.Among these indices, ICV and MRD are non-reference indexes, and the others are full-reference indexes.Since stripes of the real images are dense, it is difficult to select several samples with the size of 10 × 10 window.In this paper, we select five different samples within a 6×6 window, and the mean ICV (MICV) and the mean MRD (MMRD) are shown in Table 4 for the six real experiments.Then, we will discuss how to select parameters.We note that we test the comparing methods according to the default or suggested parameters in their papers and codes.

Simulated Experiments
In simulated experiments, the stripes with periodic (Per) and nonperiodic (NonPer) noise are mainly determined by "Intensity" and r.Here, the "Intensity" means the added absolute value of the stripe scope, and the r represents the stripes ratio level within the remote sensing images.For convenience to compare, different stripes added to remote sensing images will be denoted as a vector with three elements, e.g., (Per, 10, 0.2) which represents the periodic stripes, the "Intensity" 10 and stripes ratio 0.2.
We take six experimental images, which the first and second are available on the website ("DigitalGlobe" with http://www.digitalglobe.com/product-samples), the sixth is available on the website (https://earthexplorer.usgs.gov/),and the second, fourth and fifth examples are available on the website (MODIS data with https://ladsweb.nascom.nasa.gov/), to test the performance of different methods.In the simulated experiments, the experimental images include: Washington DC Mall band 30, MODIS band 20, MODIS band 32, two small parts of the downloaded "Map Scale Ortho" image ("Map Scale Ortho" can be freely downloaded from the DigitalGlobe http://www.digitalglobe.com/resources/product-samples/30cm-imagery.We only use the two small parts of the downloaded "Map Scale Ortho" image acquired by the Satellite WorldView-3, since the "Map Scale Ortho" image is too large).To compare these methods clearly, we zoom in destriping details on the bottom left or bottom right of the image.

Periodic Stripes
For the periodic stripes case, we only take one example, i.e., the first column of Figure 3 with added stripes (Per, 10, 0.2), to compare the performance.Most of all existing methods performs quite effective to remove the stripes due to the simple structures of periodic stripes.The first column of Figure 3 also demonstrates the consistent conclusion that all comparing approaches remove the periodic stripes and well preserve the image details of stripe-free regions.

Nonperiodic Stripes
For the nonperiodic stripes case, we test five remote sensing images from the second column to the end column in Figure 3 with added stripes (NonPer, 100, 0.6), (NonPer, 50, 0.2), (NonPer, 60, 0.4), (NonPer, 100, 0.4) and (NonPer, 50, 0.6), respectively.Then, we display the destriping results of WFAF, SLD, UTV, GSLV, LRSID and the proposed method for different simulated remote sensing images starting from the third row to the end row.See the visual results of the second column, the WFAF method has an obvious black line and changes the intensity contrast of the underlying image.Although the other comparing methods can remove stripes, some regions change the intensity contrast of the underlying image on the left and the right parts, and the proposed method shows a good performance.Then, from the third to sixth examples, we can clearly observe the residual stripes and blurring effects resulted by the others comparing methods.Moreover, our method not only removes stripes completely but also preserves image details well.In Figure 4, we display the smaller patches of Figure 3 for visual quality comparisons, and our results have a better performance than the others.
Figure 5 shows the estimated stripes based on Figure 3.In Figure 5, we can see that the other methods may generate blurring effect and change intensity contrast.Meanwhile, the estimated stripes of the proposed method neither eliminate image structures nor bring in blurring effects for both periodic and nonperiodic stripes cases.
In Table 1, we show the ReErr results between the added stripes and estimated ones of Figure 5 from the quantitative aspect.Moreover, the PSNR values for different images have been shown in Table 1, and it is easy to know that our results outperform the other methods.

Averagely Quantitative Performance on 32 Test Images
To quantitatively test robustness and effectiveness of the proposed method, Tables 2 and 3 report the averagely quantitative comparisons of 32 remote sensing images, which are randomly selected from three websites ("DigitalGlobe" with http://www.digitalglobe.com/product-samples;some subimages of "hyperspectral image of Washington DC Mall" with https://engineering.purdue.edu/~biehl/MultiSpec/; "MODIS" data with https://ladsweb.nascom.nasa.gov/).In the tables, the best PSNR and SSIM results have been highlighted in bold.Especially, we compare these methods on 32 remote sensing images with fixed parameters for each method.Table 2 shows the PSNR and SSIM results on periodic stripes with different stripe levels.Although variance of PSNR is not the smallest, the SSIM of the proposed method holds the best performance, and SSIM is an important index to indicate stability on structural similarity of one method.Therefore, the proposed method can improve the PSNR and SSIM results to remove stripes.
For the nonperiodic stripes, we show the mean value results in Table 3.The WFAF method shows the instability, and the PSNR and SSIM of LRSID method are consistent with the results in [27].From the two tables, our method always shows a good performance.
In Figure 6, we take two examples of Table 2 to show the PSNR and SSIM performance of all comparing methods on each image.The y-axis stands for the value of PSNR or SSIM and the x-axis represents the i-th image of 32 examples.Figure 6I,II show the PSNR and SSIM performance of stripes (Per, 100, 0.6).Moreover, Figure 6III,IV show that of stripes (NonPer, 50, 0.2).Although the PSNR results fluctuate with respect to different images, our method holds the best PSNR results on most of all images.Moreover, the SSIM results show the best performance with the smallest variance, which is consistent with the results of Tables 2 and 3. From Figure 6, our method is superior to the other comparing methods.

Real Experiments
We also display the destriping results of six methods for six real remote sensing images, which are also available on the website (https://ladsweb.nascom.nasa.gov/)(see Figure 7).In the real experiments, we show the six real images: MODIS band 34, original band 28, Original Aqua MODIS band 30, MODIS band 136, Original MODIS Terra band 27, and Terra Africa MODIS band 33.Similar to Figure 3, the six real images with different stripes are shown in the first row, and the destriping results of all comparing methods are presented from the second row to the end row.
In Figure 8, we show the extracted stripe components of compared methods for the six real images in Figure 7.It seems that the proposed method removes too much background information in the real image experiments.Moreover, the non-reference indexes of MICV and MMRD have been shown in Table 4, and the best results have been highlighted.The proposed method shows a good performance in most of the cases.Although the results of the first two images are not the best, the results have only little difference among these compared methods.The indexes of MICV and MMRD depend on the selected patches and the window size.Different selected patches and window size may result in the performance of the two indexes.Here, we utilize them to partially evaluate the quantitative performance of the real image destriping.In Figure 7, for the first, fifth and last real images, the proposed method not only removes the stripes completely, but also preserves image details on stripe-free regions well.Note that the methods GSLV and LRSID fail to obtain desired results for the first image as the mentioned in their papers.For the fourth column, there are also several stripe residuals with WFAF and SLD, and the wide black shadow areas appear by the UTV, GSLV and LRSID methods.Moreover, the destriping results of the WFAF and SLD leave obvious stripes for the second image, and still exist the wispy stripes for the third example.According to several real experiments, the results demonstrate the universal effectiveness and stability of the proposed method.For the further comparisons of different destriping methods for simulated and real remote sensing images, we show the following two assessments.One is the mean cross-track profile that the x-axis stands for the column number of an image and the y-axis represents the mean value of each column (see Figures 9 and 10).The other is the power spectrum that the x-axis is the normalized frequencies of an image, and the y-axis shows the spectral magnitude with a logarithmic scale (see Figures 11 and 12).In simulated experiments, the mean cross-track profile of the first image of Figure 3 is shown in Figure 9.Note that Figure 9a shows the mean cross-track profile of the underlying image, and Figure 9b is the result of the degraded image.Moreover, Figure 9c-f shows the mean cross-track profile results of the six destriping methods, respectively.From the overall perspective, Figure 9d,e shows the obvious change of the intensity contrast.Seeing the details, Figure 9c-g has some mild fluctuations which are different from the underlying image in Figure 9a.The performance of the proposed method is almost same as that of the original one.
In addition, the power spectrum results of the second image of Figure 3 is shown in Figure 11.We denote the power spectrum results as Figure 11a-h which represent the power spectrum results of the underlying image, the degraded image and the destriping results of six methods, respectively.Figure 11c-g has more fluctuations which indicate these methods may have the stripe residuals or bring a little new noise in their destriping processes.Our method, i.e., Figure 11h, it not only removes all stripes, but also preserves almost the essential details such as edges.In real experiments, we also show the mean cross-track profile and the power spectrum in Figures 10 and 12, respectively.Figure 10 shows the mean cross-track profile results of the first column of Figure 7.Note that Figure 10a is the mean cross-track profile result of the first real remote sensing image, and Figure 10b-g shows the profile results of the six destriping methods, respectively.In general, the profiles of the destriping method should smoothen huge fluctuates and maintain primary structure information.However, the profiles of WFAF and LRSID have obvious fluctuations where the stripes still exist, and that of SLD is over-smooth which loses a lot of underlying image details.In Figure 10d,e, although stripes are mostly removed, the destriping profiles have some mild blur and too much smoothness because of the unidirectional property of UTV and the global sparsity of GSLV, respectively.In addition, the profile of the proposed method, i.e., Figure 10g, can realize the desired result both on removing stripes and keeping underlying image details.
In Figure 12, the power spectrum results of the fourth example of Figure 7 are plotted.Figure 12a-h represent the power spectrum results of the fourth real remote sensing image and six destriping methods, respectively.We observe that the real remote sensing image in Figure 12a has many fluctuations where stand for stripes.According to the power spectrum results of the six methods in Figure 12b-f, although the stripes are almost removed well, there are still some slight blurring regions, while the proposed method shows the desired performance in Figure 12b.

The Influence of Different Regularization Terms in the Proposed Model
Fully considering the destriping problem in Equation ( 2) and the optimization model in Equation ( 13), we assume that R 2 is a necessary term, since R 2 is the only term to describe the property of the underlying image u.To confirm whether both R 1 and R 2 are necessary priors as well as have important contribution to destriping performance, in Figure 13, we give the mean value of PSNR and SSIM for 32 images as before.Here, R 12 represents R 1 + R 2 , R 23 stands for R 2 + R 3 and R 123 represents R 1 + R 2 + R 3 (i.e., the proposed model).Please find the definitions of R 1 , R 2 , R 3 from Equations ( 10)-( 12), respectively.Figure 13I,II show the mean value of PSNR and the mean value of SSIM on 32 images same as before for periodic stripes.The periodic stripe levels in Figure 13a-f are (Per, 10, 0.2), (Per, 10, 0.6), (Per, 50, 0.2), (Per, 50, 0.6), (Per, 100, 0.2) and (Per, 100, 0.6), respectively.Moreover, Figure 13III,IV display the mean value of PSNR and the mean value of SSIM on 32 images for nonperiodic stripes.The nonperiodic stripe levels in Figure 13a-f stand for (NonPer, 10, 0.2), (NonPer, 10, 0.6), (NonPer, 50, 0.2), (NonPer, 50, 0.6), (NonPer, 100, 0.2) and (NonPer, 100, 0.6), respectively.
From the results in Figure 13, we can conclude three points.(1) The PSNR results of the proposed model (i.e., R 123 ) have the best performance compared to those of the other two models (i.e., R 12 and R 13 ), and the SSIM results have the similar performance as compared with the PSNR results.(2) R 23 shows more stability than R 12 as the green bars do not significantly change with different stripes.
(3) R 3 actually plays a more important role than R 1 with respect to PSNR (see Figure 13I,III.On the contrary, R 1 plays a more important role than R 3 with respect to SSIM (see Figure 13II,IV).Figure 13 demonstrates the effectiveness of the proposed model and the importance of the three terms.(III) (IV)

Parameter Value Selection
In this paper, the proposed method mainly involves six parameters λ, µ, β 1 , β 2 , β 3 , and β 4 .The stripes of different types can be removed by setting different parameters.For example, if the stripes are dense, the µ should be small and the λ should be large.
We tune the parameters by given strategy as follows.For instance, Figure 14, shows contour maps of PSNR values of different parameters setting for one example.In Figure 14I, we tune the regularization parameters λ, µ, and fix β 1 , β 2 , β 3 and β 4 .The yellow region is the best choice for the highest PSNR, and the orange is the next.Thus, we empirically set the regularization parameters within the range [0.1,10] for λ and [0.1, 1] for µ from this kind of contour maps for different examples.Similarly, from the Figure 14II, we empirically set the regularization parameters within the range [1,10] for β 2 and β 3 since most of all PSNR results are similar.Then, λ 1 is empirically set with [1, 10 2 ], and β 4 is set with [1, 10 3 ].Note that, we tune these parameters by 10 times interval in each parameter range.For instance, when tune β 4 , we choose β 4 = 1, 10, 100, 1000.Note that, the parameters cannot be automatically or adaptively chosen for different images.In this paper, we can have a good performance in the most of the situations.For most of the simulated experiments, we set the parameters with λ = 1, µ = 0.1, β 1 = 100, β 2 = β 3 = 10, β 4 = 1000.For the real experiments, most of the examples show the superior results with λ = 10, µ = 1 and β 1 = β 2 = β 3 = β 4 = 1.Note that, if fine tuning parameters based on this strategy for each image would get better results.However, we unify parameters to simplify the process of the parameter setting.

Limitation
In Table 5, we have shown the computed cost of two experiments.Actually, since the proposed method iteratively solves the 0 norm as sub-problems by the designed algorithm, the proposed method indeed increases the computational burden compared with others methods.In particular, some statistic-based methods have a very fast speed.For example, SLD method only takes 0.046 s (for the image size 300 × 300).Actually, it is not easy to achieve the trade-off between the fast speed and the high accuracy.The more time cost may be viewed as a limitation of the proposed method.

Conclusions
In this paper, we proposed a directionally non-convex 0 sparse model for remote sensing image destriping.We consider the directional and sparse characteristics of the stripe and the underlying image from local and global aspects.The non-convex 0 norm can be effectively solved iteratively rather than hard threshold.This model was efficiently solved by the designed PADMM algorithm based on the MPEC reformulation.Furthermore, we also theoretically gave the corresponding proof of the convergence to the KKT point by this work.Experimental results on simulated and real data demonstrated the effectiveness of the proposed method, both quantitatively and visually.In particular, the proposed method obtained promising quantitative results, e.g., PSNR and SSIM, than some recent competitive methods on the average performance of 32 collected images.The power spectrum and spatial mean cross-track profiles were also employed to illustrate the superiority of the given method.Besides, for the real images, we also employed some non-reference indexes such as MICV and MMRD to compare the performance of different compared methods, which also demonstrated the effectiveness of our method.
In the future, we will try to speed up the proposed method using C language or parallel computing, and release the corresponding source code.In addition, we will extend the proposed model to the oblique stripes removal by fully considering the latent properties of oblique stripes.Furthermore, the proposed method was only applied to single-band image stripe removal.We may extend our framework to multispectral or hyperspectral image stripe removal by some intrinsic properties, e.g., low-rank and non-local priors.

Figure 1 .
Figure 1.Framework of the proposed destriping method.

Figure 2 .
Figure 2. The number of nonzero elements of s (a) and ∇ y s (b), where s is estimated from a real image example (see Figure3) by the method[26].The y-axis represents the number of nonzero elements of s and ∇ y s, respectively, and the x-axis stands for the image pixel value whose range is [0, 1].

Figure 4 .
Figure 4.The zoom results of different simulated images in Figure 3. From top to bottom: zoom of the underlying images, the degraded images, the destriping results of WFAF, SLD, UTV, GSLV, LRSID, and Ours.Note that the levels of stripes are same as Figure 3.

Figure 5 .
Figure 5.The stripes s of different simulated images in Figure 3. From top to bottom: the added stripes on the underlying image, the extracted stripe components of WFAF, SLD, UTV, GSLV, LRSID, and Ours.Note that the levels of stripes are same as Figure 3.

Figure 7 .
Figure 7.The visual results of different real images.From top to bottom: the real images, the destriping results of WFAF, SLD, UTV, GSLV, LRSID, and Ours.Readers are recommended to zoom in all figures for better visibility.

Figure 8 .
Figure 8.The stripes s of different real images in Figure 7. From top to bottom: the extracted stripe components of WFAF, SLD, UTV, GSLV, LRSID, and Ours.

Table 1 .
The ReErr results between s added and s estimated , and the PSNR between the ground-truth and the computed image by the different methods.

Table 2 .
The mean value of PSNR and SSIM of 32 images with periodic noise.

Table 3 .
The mean value of PSNR and SSIM of 32 images with nonperiodic noise.

Table 4 .
The MICV and MMRD index results of the six real images.

Figure 13 .
The influence of different terms in the proposed model.R 12 represents R 1 + R 2 , R 23 stands for R 2 + R 3 and R 123 represents R 1 + R 2 + R 3 (i.e., the proposed model).

Table 5 .
The computational cost of the methods.