Robust Superpixel Segmentation for Hyperspectral-Image Restoration

Hyperspectral-image (HSI) restoration plays an essential role in remote sensing image processing. Recently, superpixel segmentation-based the low-rank regularized methods for HSI restoration have shown outstanding performance. However, most of them simply segment the HSI according to its first principal component, which is suboptimal. In this paper, integrating the superpixel segmentation with principal component analysis, we propose a robust superpixel segmentation strategy to better divide the HSI, which can further enhance the low-rank attribute of the HSI. To better employ the low-rank attribute, the weighted nuclear norm by three types of weighting is proposed to efficiently remove the mixed noise in degraded HSI. Experiments conducted on simulated and real HSI data verify the performance of the proposed method for HSI restoration.


Introduction
Hyperspectral images (HSIs) reflect the spectrum image of a real scene in hundreds of continuous wavebands. Thus, HSIs contain abundant spectral information and have been applied in various fields, such as terrain exploration, environmental monitoring, and military surveillance. Unfortunately, due to the limitation of devices and environmental factors, HSIs are often corrupted by various kinds of noise during data acquisition and transmission. Therefore, HSI restoration is a significant pretreatment, and affects the performance of the subsequent applications, including classification [1][2][3], unmixing [4][5][6], object detection [7,8], and super-resolution [9,10].
In the recent decade, based on the spatial-spectral prior knowledge of HSI, numerous HSI restoration methods have been proposed [11][12][13][14][15]. For instance, paper [11] presented a spectral-spatial adaptive total variation denoising strategy. Article [12] displayed a spectralspatial structured sparse representation in the HSI restoration model. The group sparse nonnegative matrix factorization (GSNMF) method for HSI restoration was proposed in [13]. With the group sparse regularization term, spectral signatures shared a common set of bases for HSI reconstruction in [14].
The strong correlation of adjacent bands and adjacent pixels ensures that the clean HSI has an underlying low-rank character. Inspired by this prior knowledge, many low-rank matrix or tensor-based methods have been proposed for HSI denoising. For instance, the low-rank matrix recovery (LRMR) approach [16] reformulated the cubic HSI data into a low-rank matrix and adopted the GoDec algorithm to solve the restored problem. The weighted nuclear norm and total variation (WNNTV)-based HSI restoration approach [17] was proposed to remove the hybrid noises of sparse noise and Gaussian noise. The multichannel weighted nuclear norm minimization (MCWNNM) algorithm [18] considered the low-rank prior both across the spectra and in the spatial domain to well restore the HSI. It can be seen that the weighted nuclear norm for characterizing low-rank prior knowledge of the HSI is a powerful tool on mixed-noise removal.

•
Combining the superpixel segmentation with principal component analysis, we propose a robust superpixel segmentation strategy to better divide the HSI. A "first small jump" scheme in the robust superpixel segmentation is devised to select the PCs. The selected PCs include most of the important information of the HSI. The superpixel segmentation result based on the selected PCs is relatively accurate. • The weighted nuclear norm is used to characterize the low-rank attribute of superpixel fibers. In particular, we summarize the weighted nuclear norm by three types of weighting in the HSI restoration model. • The alternating direction method (ADM) is used to solve the weighted-nuclear-normbased HSI restoration model, and the corresponding iterative optimization process is derived. We adopt three operators in one frame to solve the subproblem with the weighted nuclear norm. The experimental results obtained by different operators are displayed.
The remainder of this paper is organized as follows. The robust superpixel segmentation strategy and the proposed model for HSI restoration are introduced in Section 2. Section 3 presents the algorithm for solving the proposed model. Extensive experiment results are reported in Section 4. Section 5 shows the discussion of the proposed approach. Finally, Section 6 concludes this paper.

Robust Superpixel Segmentation
Superpixel fiber is a cluster of similar pixels along the spectral dimension. Compared with the square fiber in cubic HSI, superpixel fibers with an adaptive shape can better hold the edges and details of the HSI, which guarantees that more spectral-spatial features are extracted. In other words, superpixel segmentation can further enhance the low-rank properties of the HSI. Many HSI denoising methods based on superpixel segmentation simply segment the HSI according to its first PC [26,29,30]. It is suboptimal since the first PC lacks some details and edges of the HSI. In Figure 2, taking a noisy HSI (Indian Pines) as an example, we, respectively, present the ERS results based on the first PC and top six PCs. Obviously, the segmentation result based on the top six PCs is more accurate than the one based on the first PC. Since each PC of the degraded HSI has a significant amount of noise, selecting too many PCs is not good. A vital problem is how many PCs are appropriate for the good HSI segmentation result. Before solving the problem, we present the variance contribution rate of the kth PC of the HSI y ∈ R m×n×p as follows: where i denotes the ith eigenvalue of the covariance matrix of matrix A ∈ R mn×p , in which each column is the vectorized band of y (see Figure 1). Then cumulative variance contribution rate (CVCR) of top k PCs is defined as Obviously, both the values ofr k and r k are between 0 and 1. With the increasing k,r k becomes smaller, but r k becomes larger and more closed 1 . Larger r k indicates that the top k PCs hold more information of the HSI.
In statistics, the CVCR of selected PCs must reach more than 0.8. However, this is impracticable in HSI processing. Figure 3 shows the CVCR of PCs of clean and noisy Indian Pines data. For a clean Indian Pines data, the CVCR of the first PC is 0.61, but the CVCR of the top two PCs is 0.92, which means top two PCs contain almost all information of Indian Pines data. For the noisy Indian Pines data, the CVCR of the first PC is 0.41 and the CVCR of top six PCs is 0.65, which suggests that more PCs are needed to explain the main information of Indian Pines data.  To solve the vital problem above, referring to "first significant jump" scheme [31,32], we propose a "first small jump" scheme to choose some PCs. The "first small jump" scheme looks for the smallest k such that where τ is a data-dependent value to detect the first small jump of the sequence r k . The "jump" denotes the forward difference of the sequence r k , in the left side of inequality (3). The scheme aims to find out the transition point of curve in Figure 3. According to the scheme (3), the selected top k PCs can explain the significant information of the HSI. For instance, in Figure 4 the value |r 7 | − |r 6 | is firstly less than τ; then, top six PCs are selected. The selected top six PCs hold the larger CVCR than the first PC, and the other PCs that fail to provide useful information of the HSI are discarded .
The "first small jump" scheme is an adaptive method since the value τ is datadependent rather than fixed. We simply set τ = p −1 ||r|| 1 , where p is the band number of the HSI and r = [r 1 , . . . , r k , . . . ] is a vector whose elements are the CVCR of the PCs of the HSI. For different HSI data, τ is different. Based on the selected PCs, the HSI is accurately divided along the spectral dimension by the superpixel segmentation method. The precise segmentation will make the obtained superpixel fibers have intensely low-rank properties. By considering the low-rank property of each superpixel fiber, we propose a HSI restoration model in the next section.

The HSI Restoration Model Based on Low-Rank Superpixel Fiber
The general HSI restoration model is formulated as follows: where y ∈ R m×n×p and x ∈ R m×n×p are, respectively, denoted as the degraded and clean HSI; e stands for the mixture of strips, impulse noise, and dead lines; and n stands for the Gaussian noise. After robust superpixel segmentation, the degraded HSI is divided into K nonoverlapping superpixel fibers, as shown in Figure 1. Each superpixel fiber is an irregular tensor. Here, we unfold each superpixel fiber to a matrix for easy calculating. For the ith superpixel fiber, the restoration model is where Y i and X i denote the two matrices, in which each column is the vectorized band of y and x, respectively. E i denotes the reshaped mixed noise. N i denotes the reshaped Gaussian noise. The clean x is finally obtained by merging all restored X i . For convenience, the subscripts of the above formulation are temporarily omitted as follows: Each divided superpixel fiber contains the highly similar pixels and holds the high correlation across the spectra, so matrix X has intensely low-rank properties. Only some parts or bands of the HSI are contaminated by stripes, impulse noise, and dead lines; thus, matrix E is sparse [26]. The model (6) can be addressed by the following optimization problem: where λ is a regularized parameter. However, it is a NP-hard problem and cannot be solved directly.
There have been many articles that have confirmed that the weighted nuclear norm is a powerful tool in HSI restoration. We also adopt the weighted nuclear norm to explore the low-rank attributes of the superpixel fibers by displacing the rank function. Replacing the 0 norm by the 1 norm, the proposed model for superpixel fiber restoration is as follows: where is the weighted nuclear norm [33] of matrix X; σ i denotes the ith singular value of the matrix X, w = [w 1 , . . . , w p ]; and w i is the weight assigned to σ i (X). Noteworthily, we adopt three types of weighting in our model.
where N is the target rank of matrix X [34].
In the proposed model (8), the weighted nuclear norm ||X|| w, * characterizes the lowrank property of superpixel fiber and is a powerful tool to remove mixed noise. The 1 norm of matrix E characterizes the sparse property of noise in the HSI. We will employ an iterative algorithm to alternately solve E and X. Finally, we match X back to the superpixel fiber and integrate all of the denoised superpixel fibers into the HSI x.
The Lagrange function in the ADM frame is where Z is the Lagrange multiplier and µ is a positive scalar. The minimization of the problem (8) can be optimized by alternately updating the variables E and X; the corresponding subproblems are as follows: For three types of weighting, the subproblem (11) can be solved by following three operators.
where I is an identity matrix with the same size of Σ. D 2 µ −1 is the operator to solve standard nuclear norm. where D 3 N,µ −1 is the partial singular value thresholding (PSVT) operator [34]. The subproblem (12) is convex and can be easily solved as where S λµ −1 (α) = sign(α) max (|α| − λµ −1 , 0) is the soft-thresholding operator. The whole solution process is summarized in Algorithm 1.

Algorithm 1:
The HSI restoration algorithm based on low-rank superpixel fiber Input: the observed HSI y, the number of superpixel fibers K, the target rank N; Output: the restored HSI x and the sparse noise e; 1. Divide y by robust superpixel segmentation into K superpixel fibers and vectorize them as K matrices Y 1 , · · · , Y K , like Figure 1; Merge allX i into the restored HSI x.

Experiments
To evaluate the performance of the proposed method, both simulated and real data experiments are conducted. The tensor dictionary learning (TDL) [36], LRTV [37], SSLRR [26], and E3DTV [38] methods are utilized to compare with the proposed approach. All parameters involved in the competing methods are chosen as described in the reference papers. In the proposed approach, we set λ = 1 √ max(p,q) and µ = ( √ p + √ q)δ, where p and q denote the column number and row number of the matrix X, and δ is the deviation of Gaussian noise. Due to the shape of each superpixel fiber is different, the value of q is different.
Algorithm 1 typically takes far less iterations to satisfy the condition The proposed method based on the standard nuclear norm (solved by D 2 µ −1 operator) is unconsidered in experiments. As is known, its denoising performance is poorer than the one of the weighted nuclear norm. The ERS method is applied in robust superpixel segmentation to divide the noisy HSI in following experiments. The usual image quality metrics MPSNR, MSSIM, ERGAS [39], and runtime are adopted to evaluate the restoration results. The bigger MPSNR and MSSIM values, the better the denoising effect. The smaller the ERGAS value, the closer the restored HSI is to the original HSI .

Simulation Experiment
The public hyperspectral and multispectral image data are used to demonstrate the performance of the proposed method, i.e., Indian Pines (145 × 145 pixels and 224 spectral bands) and pompoms (512 × 512 pixels and 31 spectral bands). The same percentage of salt-and-pepper impulse noise and the same distribution of zero-mean Gaussian noise is added into each band. In addition, dead lines are added into some bands, as in the following two cases. In simulated experiments, the number of superpixel fibers K and the target rank N are vital parameters in the proposed method. For two cases above, Figure 5 shows the relationship between MPSNR, the ERGAS values, and the superpixel fibers' number K, and the relationship between the MPSNR value and the target rank N. We can see that when K = 34 and N = 1, the MPSNR values are biggest, and the ERGAS value is the smallest, i.e., the proposed method achieves the best results for Indian Pines data. Thus, we set K = 34 in the proposed method based on the WSVT operator, and K = 34 and N = 1 in the proposed method based on PSSV operator. In the same way, K = 16 and N = 2 are set in the proposed method for restoring pompoms data.
Noteworthily, the ERGAS value is very large when K = 1 (no superpixel segmentation). This means that only considering low-rank prior without superpixel segmentation will not adequately restore the HSI. With increasing K, the MPSNR value gradually becomes big and the ERGAS value becomes small, which suggests that robust superpixel segmentation works, and combined with the weighted nuclear norm it can obtain better HSI restoration results. However, too many superpixel fibers (large K) may lead to biased segmentation. Thus, when K is bigger, the ERGAS value starts to grow, i.e., the denoising performance starts to decline. Figures 6 and 7 present the restoration results of all test methods on Indian Pines data and pompoms data. Figures 8 and 9, respectively, display the enlarged images of the red box in Figures 6 and 7. In Figure 6, all compared methods can remove a significant amount of mixed noise. However, some boundaries in the restored band obtained by TDL are blurred, and some edges in the restored band obtained by LRTV are lost. The restored band of SSLRR still contains a litter noise. The restored band of the proposed approach based on WSVT operator is not smooth. In Figure 8, while E3DTV removes almost all of the noise, the proposed approach based on the PSVT operator retains more details and edges, and its restored band is closer to the original band.   In Figure 7, TDL fails to remove mixed noise. The restored band obtained by LRTV is blurred. E3DTV wipes off the mixed noise well but also smooth some edges. Visually, the restored bands achieved by the proposed method are clearer than the one of SSLRR, which confirms that the robust superpixel segmentation is better for HSI restoration than superpixel segmentation just using the first PC. In Figure 9, we can see that the proposed method not only removes the mixed noise but also recovers the vivid images. The statistical data reporting the performance of all test methods are shown in Tables 1 and 2. The highest MPSNR value and MSSIM value, the lowest ERGAS value, and the least time are highlighted in bold. From two tables, we can see that the proposed method based on the PSVT operator with the highest MPSNR and MSSIM values outperforms other methods, which is consistent with the Figures 6 and 7. This demonstrates that robust superpixel segmentation combined with the weighted nuclear norm achieves the best performance on mixed-noise removal.  Figure 10 and Figure 11, respectively, display the horizontal mean profiles of Indian Pines data in case 1 and pompoms data in case 2. The horizontal axis and the vertical axis refer to the column number of the HSI and the mean values of each column, respectively. The red curve stands for the value of the original and clean HSI, and the blue curve stands for the value of the restored HSI. We observe that both for Indian Pines data and pompoms data, the proposed method based on the PSVT operator presents a smooth curve close to the original one , meaning that the mixed noise can be effectively suppressed. Figure 12 and Figure 13 respectively illustrate the spectral signature of pixel (10,10) in Indian Pines data and pixel (100, 100) in pompoms data. The rapid fluctuations of the blue curve indicate the existence of mixed noise. In Figure 12c, it can be clearly observed that the blue curve is very matched by the red one, which means that the restored pixel by the proposed method based on the PSVT operator is nearly the same as the pixel of the original Indian Pines data. Unfortunately, not all pixels can be recovered precisely by the proposed method, such as the pixel (100, 100) in the pompoms data.

Real Data Experiment
Urban data and Botswanna data are popular in the HSI denoising experiments [40,41]. We also use them in real data experiments. Only SSLRR and E3DTV are employed as comparison methods, due to the performances of LRTV and TDL being poor in simulated experiments. In the proposed method, we set K = 12, N = 2 for urban data and K = 4, N = 2 for Botswanna data.
Urban data has 307 × 307 pixels and 210 spectral bands ranging from 400 nm to 2500 nm. It contains strong noise in some bands, such as deadlines, stripes, Gaussian noise, and other unknown noise types. Figure 14a shows the band 87 and band 156 of the urban data. We observed that band 87 is contaminated by several horizontal lines, and band 156 is corrupted by stripes and Gaussian-like random noise.
Botswanna data was collected in 2001 across Okavango Delta, Botswana (BOT) and has a size of 1476 × 256 pixels and 242 spectral bands ranging from 357 to 2576 nm [34]. We use a subimage of size 256 × 256 × 145, where some noisy and water absorption bands are discarded. Figure 15a illustrates the band 68 and band 110 of Botswanna data. We observed that band 68 and band 110 are heavily corrupted by water absorption and unknown noise, which make the denoising task more challenging.  Figure 14 shows the restoration results of four HSI denoising methods on urban data. Figure 16 presents the enlarged images of the red box in Figure 14. We observe that SSLRR fails to remove the horizontal lines in band 87 although it suppresses most of the noise in Figure 16. E3DTV discards some details and makes some edges smooth in restored bands 87 and 156. The proposed strategy presents remarkable ability regarding mixed-noise removal and restores the clear and vivid bands.
In Figure 15, restored bands 68 and bands 110 obtained by four HSI denoising methods on Botswana data are displayed. SSLRR suppresses some of the noise but cannot remove it completely. E3DTV works well for noise removal, but the detailed spatial information is lost. The proposed approach efficiently removes the unknown noise and better preserves the details, as shown in the red box in Figure 15e.

Discussion
There are four test data in our experiments, i.e., noisy Indian Pines, noisy pompoms, and urban and Botswanna data. Figure 17 shows the CVCR of PCs of four test data. According to the "first small jump" scheme, we, respectively, select top five PCs for noisy Indian Pines data and top three PCs for noisy pompoms data in simulated experiments, the top five PCs for the urban data, and the top five PCs for the Botswanna data in real data experiments. It is suggested that value τ in "first small jump" scheme is data-dependent rather than fixed.
In particular, in Figure 17c, the CVCR of the top five PCs is 0.9007, which means that the top five PCs can fully explain the urban data. Likewise, as shown in Figure 17d, the CVCR of the top five PCs is 0.9922, and they also fully explain the Botswanna data. It is suggested that the "first small jump" scheme is reasonable and feasible, especially for real HSI data.  For the proposed approach based on PSVT operator, Table 3 reports the relationship between the number of PC and MPSNR values when recovering the noisy Indian Pines and pompoms data, respectively. Obviously, when just selecting the first PC, the MPSNR values are low for two noisy data. When selecting top five PCs for noisy Indian Pines data and top three PCs for noisy pompoms data, the proposed method performs best with the highest MPSNR value. This demonstrates that robust superpixel segmentation combined with the weighted nuclear norm obtained the best performance. Since the weighted nuclear norm is non-convex, the convergence of the proposed method is hard to proof in theory. However, the proposed method based on the WSVT operator can obtain a fixed-point solution if the weights w are in a non-descending order [42]. Empirically, the proposed method based on the PSVT operator is convergent when restoring each superpixel fiber. Figure 18 shows the relationship of the iteration numbers and the MPSNR value obtained by the proposed method based on the PSVT operator for any two superpixel fibers' restoration. It can be seen that the proposed method based on the PSVT operator is fast convergent.  Figure 18. Iteration numbers vs. MPSNR for any two superpixel fibers restoration.

Conclusions
Since the accurate superpixel segmentation result will enhance the low-rank property of the HSI, in this paper, we have proposed a robust superpixel segmentation strategy using the "first small jump" scheme to more precisely divide the HSI. Then, the weighted nuclear norm is employed to characterize the low-rank property of the superpixel fibers. We have summarized the weighted nuclear norm by three types of weighting, and adopted the WSVT operator, PSVT operator, and soft-thresholding operator to solve the proposed HSI restoration model in the ADM frame. The extensive experiments on simulated and real HSIs demonstrate that our approach performed best, in both visual and quantitative assessments.