Hyperspectral Image Destriping and Denoising Using Stripe and Spectral Low-Rank Matrix Recovery and Global Spatial-Spectral Total Variation

: Hyperspectral image (HSI) is easily corrupted by different kinds of noise, such as stripes, dead pixels, impulse noise, Gaussian noise, etc. Due to less consideration of the structural speciﬁcity of stripes, many existing HSI denoising methods cannot effectively remove the heavy stripes in mixed noise. In this paper, we classify the noise on HSI into three types: sparse noise, stripe noise, and Gaussian noise. The clean image and different types of noise are treated as independent components. In this way, the image denoising task can be naturally regarded as an image decomposition problem. Thanks to the structural characteristic of stripes and the low-rank property of HSI, we propose to destripe and denoise the HSI by using stripe and spectral low-rank matrix recovery and combine it with the global spatial-spectral TV regularization (SSLR-SSTV). By considering different properties of different HSI ingredients, the proposed method separates the original image from the noise components perfectly. Both simulation and real image denoising experiments demonstrate that the proposed method can achieve a satisfactory denoising result compared with the state-of-the-art methods. Especially, it outperforms the other methods in the task of stripe noise removal visually and quantitatively.


Introduction
Hyperspectral image (HSI) data play an essential role in the field of remote sensing. Unlike natural images, it contains not only spatial information, but also rich spectral information. Therefore, it is widely used in urban planning, earth observation, agriculture, food safety, etc. [1,2]. However, HSI suffers from various noise types because of the unstable working environment, photon effects, instrument failure, etc. These noises degrade the quality of HSI and limit the performance of subsequent applications, e.g., classification, segmentation, change detection, and so on [3][4][5][6]. Therefore, mixed noise denoising becomes a crucial step for further analysis and applications of the HSIs.
Compared with natural images, HSIs contain mixed noises with different types, such as Gaussian noise, dead pixels/lines, sparse noise, stripe noise etc., which challenge the existing denoising methods. Besides, HSI noises are distributed both in the spatial domain and spectral domain, and the noise intensity of each band is different, thus making it difficult to estimate the noise of each band [7].
So far, a large number of HSI denoising methods have been proposed. These methods can be roughly divided into five groups. The most intuitive way is to regard each band of an HSI as a 2-D natural image. In this way, the classical natural image denoising methods can be used to restore the HSI data band-wise, for instance, the bilateral filter [8], the nonlocal-means algorithm [9], the block-matching 3-D filtering (BM3D) [10], etc. Unfortunately, artifacts and deformation will be brought into the denoising results because of two main drawbacks. The first one is that the natural image denoising methods always assume that the noise distribution is Gaussian, which is not fit for modeling the mixed noise. The other one is that the band-wise processing manner neglects the strong correlations between bands.
Another natural idea is to treat the HSI data as a multidimensional data cube. Then, the volumetric data denoising methods can be applied to denoise the HSIs, such as the 3D non-local means filter [11], BM4D [12], PCA-BM4D [13], etc. However, these methods do not take advantage of the correlation information through the spectral bands either, leading to unsatisfactory results. Therefore, it is necessary to consider the spatial and spectral information simultaneously to improve the denoising quality.
The third type of HSI denoising methods combines spatial and spectral information to achieve better denoising performance [14][15][16][17][18][19]. Fu et al. make use of the spectral correlation along the bands and the non-local spatial similarity to learn the adaptive dictionary for HSI denoising [16]. Zheng et al. adapt the denoising problem to a data fusion problem and take advantage of the noise-free bands to reconstruct the target noisy bands [18]. The total variation (TV) regularization is favored as an image denoising tool for its excellent denoising performance and edge-preserving property [20]. It is also popular in HSI denoising. In [14], Yuan et al. use the spectral-spatial adaptive TV operator as the smoothing prior of the image in the denoising model. This method can reduce the noise in the smooth area while retaining important edge and texture information. In [15], the authors applied the total variation (TV) to regularize the HSI data spatially and spectrally. Then, they fused the two results by a Q-weighted fusion algorithm. Compared with the results obtained from a single (either spatial or spectral) view, the spatial-spectral view fusion denoising is effective. Besides, some transform domain-based methods have also been proposed [21][22][23].
Low-rank-based methods are considered as the fourth type of common HSI denoising methods. The adjacent bands of an HSI have strong correlations, thus forming the low rankness of an HSI. In return, the low-rankness implies the low-dimensional structural characteristic of high-dimensional data, which is helpful for recovering data. In general, the low-rankness is always achieved by applying the low-rank matrix approximation [24][25][26], low-rank tensor decomposition [27][28][29], and low-rank constraint [30][31][32]. In [25], Zhang et al. take advantage of the low-rank property by regarding the noisy-free HSI data as a low-rank matrix. Then, they propose to remove the different kinds of noise at the same time based on the low-rank matrix recovery (LRMR). He et al. build a patchwise low-rank matrix approximation (LRMA) according to the low-rank property of the local 3D patches [26]. Based on the LRMA, they come up with a noise-adjusted iterative low-rank matrix approximation (NAILRMA) method. Their following work presents a TV-regularized low-rank (LRTV) matrix factorization denoising method in [33]. They use the nuclear norm, TV regularization, and L 1 norm together to formulate the restoration problem. In [34], Zheng et al. proposed a double-factor-regularized low-rank tensor factorization (LRTF-DFR) method for HSI denoising. They embed the group sparsity constraint of the spatial factor and correlation constraint of the spectral factor into the LRTF framework, and obtain a satisfactory denoising result.
More recently, many deep learning-based methods have been proposed and achieve superb results [35][36][37][38][39]. The performance of these supervised learning-based methods highly depends on the extensive noisy/clean HSI pairs and parameter selection. However, in practice, it is difficult and costly to acquire large amounts of clean HSIs in remote sensing, limiting the supervised learning methods.
In reality, it is not really necessary to separate the fourth type of HSI denoising methods from the third one. A large amount of work integrate the spatial and spectral information with the low-rank property to detach the clean HSI from the observed noisy image [29,30,32,33,40,41]. In [41], a local low-rank matrix recovery and global spatialspectral total variation model (LLRSSTV) was proposed by He et al. Instead of processing the local patches independently, as the previous low-rank-based HSI denoising methods do, the global spatial-spectral TV (SSTV) regularize all the local patches simultaneously and can ensure the global spatial-spectral smoothness of the estimated image. Besides, it helps to separate the sparse and Gaussian noise from the observed patches via the estimated clean patches.
Unfortunately, due to less consideration of the structural specificity of stripes, these methods cannot effectively remove the heavy stripe noise from mixed noise. To the best of our knowledge, most of the existing works on mixed noise denoising model the observed HSI data without considering the stripe noise as an independent component, resulting in unsatisfactory destriping performance. While for the destriping works, they mainly focus on the stripe noise removal but ignore other noise. Therefore, we are motivated to propose a destriping and denoising algorithm. In this paper, we propose a stripe-spectral low-rank (SSLR) matrix recovery and combine it with the global SSTV regularization method to restore the HSIs corrupted by various types of noises. The output of the proposed method includes three independent components that are completely decoupled, i.e., sparse noise, stripes, and the clean image.
The contributions of the proposed approach can be summarized as follows.

1.
Previous methods on denoising mixed noise usually classify stripes to sparse noise without considering its structural specificity, while in our paper, we treat the stripes as an independent component and take full advantage of its low-rank property, thus obtaining better destriping and denoising performance.

2.
A global spatial-spectral total variation (SSTV) regularization is combined with the stripe-spectral low-rank constraint (SSLR) in the proposed HSI denoising model to reconstruct the clean image, stripe, and sparse noise.

3.
The augmented Lagrange multiplier (ALM) algorithm is employed to solve the proposed SSLR-SSTV model. Both simulated and real data experimental results demonstrate that the proposed method improves the denoising results significantly when compared with the state-of-the-art techniques. Especially, the proposed method provide superb destriping performance with the stripe low-rank constraint.
The rest of the paper is organized as follows. Section 2 introduces the backgrounds and preliminaries on low-rank property of HSI and stripes and SSTV regularization. The proposed denoising model is described in Section 3. Section 4 is the experimental configurations. In Section 5, both simulated experiments and real data experiment are described and analyzed. We analyze the influence of selecting several important parameters in Section 6, and finally, conclusions are given in Section 7.

Low-Rank Property of HSI and Stripe Component
The spectral bands of HSI data are highly correlated and have abundant redundancy. This implies that a large number of spectra can be represented by linear low-dimensional models [28]. To illustrate this property, we extract a full-band image patch of size m × n × P from a HSI of size M × N × P, and convert the patch into a matrix of size mn × P. For analyzing the low-rank property over the bands, we decompose the matrix using the singular value decomposition (SVD), as shown in Figure 1a. The rank of full-band patch is much less than its size, i.e., r 1 P and r 1 mn. In terms of the stripe noise, it is different from other noise and has significant structural and directional characteristics. We quantitatively analyze the property of stripes in HSI. For a simulated stripe noise image, we decompose the stripe matrix using the SVD, as shown in Figure 1b. We can observe that the singular values of the stripe image rapidly decrease to zero with rank 1. In other words, the subspace of additive stripes can be well modeled by low-rank constraints. Therefore, we propose to use the low-rank constraint on the stripe directly.

Spatial-Spectral TV (SSTV) Regularization
TV-based techniques have been widely applied to image denoising since its emergence [20]. It helps to regularize the signals while preserving the important details. Without exception, it is one of the most powerful and popular tools for HSI noise removal [14,33,42,43]. For a 2D grayscale image I, the standard isotropic TV norm is defined as where D i and D j compute the first-order derivative along the horizontal and vertical direction of the input image, respectively. The · 1 is L1-norm and denotes the sum of absolute value of the matrix. Note that an HSI has three dimensions and strong correlations between bands. Therefore, the isotropic SSTV norm of an HSI X is defined as where D b acts the same role as D i and D j do, but along the spectral direction. As the gradient of different directions on HSI may be different, the istropic SSTV norm Equation (2) can be extended to an anisotropic version: where τ i , τ j , and τ b are the weighting coefficients in three directions.

Stripe-Spectral Low Rank and Spatial-Spectral TV Method
As mentioned before, an HSI contains different kinds of noises. In this paper, we consider the following model: where Y ∈ R M×N×P is the observed HSI, with M, N, and P the number of the rows, columns, and bands, respectively. In (4), X is the desired clean image, S corresponds to the additive stripe noise; N denotes the Gaussian noise; and B represents the sparse noise, including the impulse noise, dead pixels, or lines. Despite the fact that the noises are dependent in raw space, we do not know the prior knowledge about the correlation between different noises. Therefore, in this paper, we assume that the different noises are independently distributed. The aim of denoising is to recover X from Y.

Proposed Model
As we have analyzed in Section 2.1, the stripe component S occurs in random bands and has structurally repeated structure that can be described low-rank features. Therefore, we are inspired by using a low-rank constraint on the stripe. For recovering the clean image X from observed HSI Y, except for the widely used SSTV regularization, we also apply the spectral low-rank constraint as a regularization term. Finally, we use the sparse constraint on sparse noise component B. Therefore, our image reconstruction model is described by As the rank constraint rank(·) is non-convex, we adopt the nuclear norm · * to replace it, i.e., the sum of the singular values of matrix [44,45]. Therefore, the reconstruction model is reformulated as min X,S,B where λ, τ, and β are the coefficients that controls the sparsity regularizer of noise, SSTV regularizer, and the stripe low-rank regularizer, respectively. Moreover, r 1 and r 2 are the upper bounds of the low-rank matrix.
When solving the low-rank constraint of the above model, the size of each band X b of the HSI is elongated from M × N to MN × 1, so the size of X becomes MN × p, making X a very thin matrix. It is proved in [46] that the processing of a very thin matrix may cause parametric blurring because of the limited degrees-of-freedom. Therefore, the application of a global low-rank constraint to the HSI is unfavored. In order to solve this problem, in [41], the authors divide the HSI in local and full-band overlapping patches, thus better preserving the detail information. In this paper, we also apply the low-rank constraint to the local patches, refer to Figure 1a as an example. We obtain full-band patches from the 3-D image X and sparse noise B. For the stripe noise, we still use the global information. Based on this idea, the optimization problem (6) can be defined as where X ij and B ij are the clean image patch and sparse noise patch of m × n × p at location (i, j), respectively. The patch cube is extracted by a binary operator A ij : X → X ij . The proposed reconstruction model aims to optimize three variables-X, S, and Bsimultaneously, which can be solved via an alternatively minimizing strategy. Next, we will introduce the optimization procedure and solve the optimization problem (7) according to the augmented Lagrange multiplier (ALM) method.

Optimization Procedure
As mentioned above, we adopt the ALM to solve the minimization problem (7) by introducing three variables J, W ∈ R M×N×P and U ∈ R M×N×P×3 . The optimization problem (7) is then equivalent to The augmented Lagrangian function is described as follows: where ρ is the penalty coefficient, and I Y ij , I X ij , I J , I U , and I S b are the Lagrangian multipliers. By choosing proper initialization and regularization parameters, we can obtain a satisfactory result. The restoration process of ALM can be decomposed into six simpler subproblems, and their variables are updated in an alternate iterative manner.
(1) Optimization problem regarding X ij : The step of updating X ij is a typical low-rank matrix approximation problem, which can be easily solved by performing a soft threshold operation on the matrix singular value decomposition (SVD) [47]: where Σ r 1 ii is the diagonal element of the singular-value matrix Σ r 1 = diag(σ i (1 < i < r 1 )).
(2) Optimization problem regarding stripe noise S b : Given an image X, sparse noise B, and the optimization equation of striped noise, S b is as follows: Equation (12) can also be easily solved by performing soft threshold operation on the matrix singular values.
(3) Optimization problem regarding sparse noise B ij : Given an image X, stripe noise B, and the optimization equation of sparse noise, B ij is as follows: where shrink_L 1 is the soft-threshold operator of L1 norm [48], which is defined as (4) The J-related optimizing step: It is easy to see that (16) has a closed-form solution. The form of solution is as follows: where L represents a full-one tensor of size M × N × P, and ./ means dividing each element of the previous matrix by the corresponding element of the latter matrix.
(5) The W-related optimizing step: This problem can be effectively solved according to the fast Fourier transform (FFT) method [41]: where F and F −1 represent the Fast Fourier Transform (FFT) and the inverse FFT, respectively. D T stands for the adjoint operator of D.
The U-related optimizing step: where Y and U can be expressed as The optimization problem (20) can be solved by soft-thresholding (shrinkage) operator shrink_L 1 , which is the same as Equation (15): Then, the Lagrangian multipliers I Y ij , I X ij , I J , I U and I S b can be updated in parallel: The algorithm procedure is summarized in Algorithm 1.

Experimental Indicators and Settings
In the simulated experiments, we use the mean peak signal-to-noise ratio (MPSNR) [49,50], mean structural similarity (MSSIM) [51], and the mean spectral angle mapper (MSAM) [52] to evaluate the quality of the denoising results of each algorithm. PSNR and SSIM are two of the most widely used quality assessment indexes in the field of image processing and computer vision. The PSNR measures the quality of the restored image according to the mean square error, and the SSIM evaluates the similarity between the target image and the reference image. The higher the PSNR and SSIM are, the better the restored image is. SAM calculates the spectral similarity according to the angular difference between the spectrum vectors of the filtered and the noise-free HSI. The smaller SAM is, the more similar the restored image is to the original image. These three measurements are calculated as where I max,b is the maximum value of the image on the b th band.û b and u b denote the denoised and clean HSI in b th band, respectively. µ u b and µû b are average values of image u b andû b , and σ u b , σû b represent the the variances, while σ u bûb means the covariance between two images. In real HSI experiments, due to the lack of reference image, we use a no-reference HSI quality assessment (The MATLAB code was provided by Dr. Jingxiang Yang; Available: https://github.com/polwork/No-Reference-Hyperspectral-Image-Quality-Assessmentvia-Quality-Sensitive-Features-Learning (accessed on 11 January 2021)) [50] to evaluate the denoised images. The smaller the quality assessment score is, the better the denoising or restoring result is.
We set the penalty parameter ρ as 10 −2 . The patch size is initialized to 20 × 20, and the step length is 10 both in the horizontal and vertical directions, empirically. Finally, the stopping criterion ε and the iteration number IterMax are set to 10 −6 and 50, respectively. Before the denoising, we normalize the HSI data to [0, 1] by dividing the maximal value of the HSI data cube. The selection of parameters will be discussed in Section 6.

Simulation Configurations
Real HSIs are usually degraded by a mix of various noises. The goal of our algorithm is to destripe and denoise. To simulate a noisy image, we add Gaussian noise with different standard variance σ, salt-and-pepper impulse noise with different percentage o, and stripes with different percentages r and different intensities v to randomly selected 30% the bands of the two HSI datasets, as described in the following three cases.

Experimental Results
In this part, we conduct several simulated and real data experiments to verify the effectiveness of our method.

Simulated Data Experiments
In our proposed method, clean image components X are reconstructed through TV regularization terms and local low-rank constraints. The L 1 sparse regularization and the low-rank constraint help to separate the sparse noise B and stripe noise S from the HSI, respectively. Figure 2 illustrates the output independent components of our method in case 1 with σ = 0.05, o = 0.1, v = 0.075, and r = 0.3. Figure 2a presents the observed image Y, Figure 2b is the denoised image X, and Figure 2c,d shows the output sparse noise B and the output stripe noise S, respectively. Figures 3 and 4 present the denoising results of the different methods for the two simulated data sets in Case 1 with r = 0.5, v = 0.075. Figure 3 shows the denoising results of Pavia city center data in band 11 and its zoom-in image. Figure 4 presents the denoising results of Washington DC Mall data in band 3 and its zoom-in image. Figures 3a and 4a present the original Pavia City Center data and Washington DC Mall data, respectively. Figures 3b and 4b show the image after adding the simulated mixed noise. Figures 3c-i and 4c-i display the denoising results of the different methods. By visually comparing the denoising results of these methods, it can be seen clearly that the proposed method outperforms the other methods and can effectively remove the mixed noise. From Figure 4c-i, we can observe that BM4D fails in removing sparse noise and stripe noise because it does not consider the correlation between bands. NAILRMA does not behave well because it only considers the Gaussian distribution of noise. LRMR, LLRSSTV, LRTV, and LRTF-DFR can effectively remove the sparse noise and Gaussian noise, but the stripe noise is not well filtered. Figures 3 and 4 demonstrate that the proposed SSLR-SSTV can best preserve local details and remove high intensity of stripe noises in the mixed noise. Figure 5 shows the PSNR and SSIM values of each band of the Pavia City Center images and Washington DC Mall in case 1, respectively. As presented in Figure 5, the proposed method achieves the best PSNR and SSIM values in most bands of the image. In addition, we can see that the PSNR and SSIM of our method are smooth between bands, while other methods have relatively large differences between each band, which forms sawtooth. Actually, the sawtooth is generated at bands where there are stripe noise. Figure 5 proves that our method has better destriping performance than the other methods. The above experiments demonstrate the superiority of the proposed method, especially its ability of destriping in mixed noise.
The quantitative evaluation results of the different denoising methods with the simulated noise in cases 1 and 2 for the Washington DC Mall data and the Pavia City Center data are shown in Table 1. We use bold to mark the best result for each quality index in the same case. As is shown in Table 1, our method achieves the highest MPSNR, the largest MSSIM, and the smallest MSAM in most cases when compared to the other methods. This result proves the advantage of our method in removing mixed noise from HSI.
Besides, we made rough comparisons with three deep learning-based methods [35][36][37]. Given the input images with the same noise level, our method can yield comparable or even better denoising results according to the evaluation indexes. However, we cannot conclude that our method is better than the deep learning-based methods, as those deep learning-based methods need proper training datasets, hyperparameter selection, etc., which are not accessible in our experiments. In this paper, we focus on developing a model-based HSI denoising method to quickly obtain accurate results without supervising information and training data.     Table 2 shows the destriping results for both Pavia City and Washington DC datasets with the simulated stripe noise in cases 3. In general, our method has higher MPSNR and MSSIM values in most cases compared with the LRID method. This phenomenon shows that our method is competitive in restoring images corrupted by stripe noise. We show the destriping results of the Pavia city and Washington DC Mall in Figures 6 and 7, respectively.

Real HSI Noise Experiments
In this section, we evaluate the performance of the proposed method on two real HSI data sets: EO Hyperion Dataset and HYDICE Urban Dataset. Figures 8 and 9 show the denoising results obtained in the EO Hyperion data of bands 132 and Urban dataset of bands 206, respectively. From Figures 8a-h and 9a-h, we can see that the BM4D method fails to denoise the mixed noise. As presented in Figures 8 and 9c,d, the NAILRMA and LRMR can remove some noise, but there is still obvious noise surviving. In Figures 8e and 9e, LRTV has oversmoothed the image and distorted the edge. By comparison, our method, LLRSSTV, and LRTF-DFR achieve a better restoration result visually than other compared methods. In Figure 8e,g, LLRSSTV is slightly better than our method in maintaining details. Table 3 shows the blind HSI quality assessment of different methods on real data. As shown in this table, our method achieves the lowest scores on two real HSI datasets. This result further demonstrates the superiority and robustness of our method.

Discussion
Parameters play significant roles in controlling the denoising performance. In our method, the parameter λ controls the regularization of the sparse noise B, β is the parameter for stripe noise S, and τ adjusts the spatial-spectral smoothness of the reconstructed X.
In the experiments, we adjust these parameters to achieve the highest PSNR value.
First, we analyze the influence of the parameters τ, λ and β on the denoising results with the simulated noise in cases 1 and 2 for the two simulated datasets. Figure 10 shows the MPSNR values of SSLR-SSTV on the Pavia dataset and Washington DC Mall data, as related to parameters τ, λ, and β. It can be seen from these figures that the optimal parameters of our method are τ = 0.03, λ = 0.3 and β = 1, and the result is robust under different noise cases and different images.
Second, we investigate the effect of the upper bound rank r 1 and r 2 . We select r 1 and r 2 from [1,6], and fix the parameters as τ = 0.03, λ = 0.3 and β = 1. It is clear that the denoised results are best when r 1 = 2 and r 2 = 1 in Figure 11. This is in accordance with our analysis in Section 2.1. During the simulation experiment, we find some bias between the mean value of the filtered result and the original image, i.e., the filtered image is visually darker than the original image. This phenomenon is because our algorithm regards horizontal and vertical components of the clean image as stripes. Numerous experiments have shown that the difference between the mean value of the filtered image and the original image is close to the mean value of the stripes component. Therefore, we compensate for the filtered result by adding the difference and achieve an excellent visual effect.
Furthermore, we compare the running time with the other methods. Table 4 shows the average processing time of these methods on simulated data and on real data. All of the compared algorithms are running on Matlab R2019a (Intel i7 CPU at 3.60 GHz, 16 GB of memory). Note that our degradation model contains four components, and the optimization process may be a little more costly than the other methods, but our method can provide better denoising and destriping results.

Conclusions
Focusing on the destriping and denoising problem of HSI corrupted by mixed noise, we proposed a stripe and spectral low-rank matrix recovery and global spatial-spectral total variation (SSLR-SSTV) method in this paper. In our model, the noises in HSI are classified into sparse noise, stripe noise, and Gaussian noise. A global spatial-spectral TV regularization and local low-rank constraint is utilized to remove the Gaussian noise and reconstruct the clean image. Considering the significant structural and directional characteristic of stripes, we use the low-rank constraint on the stripe noise of each band. In addition, we take advantage of the sparse property to remove the sparse noise. The designed restoration model is efficiently solved by the augmented Lagrange multiplier algorithm. Experiments on both simulated and real HSI datasets show that our method has the ability to accurately suppress noise and keep image details. Our method outperforms the state-of-the-art methods both in visual quality and evaluation criteria, especially when the stripe noise is strong in the mixed noise.