Polarimetric Contextual Classification of PolSAR Images Using Sparse Representation and Superpixels

In recent years, sparse representation-based techniques have shown great potential for pattern recognition problems. In this paper, the problem of polarimetric synthetic aperture radar (PolSAR) image classification is investigated using sparse representation-based classifiers (SRCs). We propose to take advantage of both polarimetric information and contextual information by combining sparsity-based classification methods with the concept of superpixels. Based on polarimetric feature vectors constructed by stacking a variety of polarimetric signatures and a superpixel map, two strategies are considered to perform polarimetric-contextual classification of PolSAR images. The first strategy starts by classifying the PolSAR image with pixel-wise SRC. Then, spatial regularization is imposed on the pixel-wise classification map by using majority voting within superpixels. In the second strategy, the PolSAR image is classified by taking superpixels as processing elements. The joint sparse representation-based classifier (JSRC) is employed to combine the polarimetric information contained in feature vectors and the contextual information provided by superpixels. Experimental results on real PolSAR datasets demonstrate the feasibility of the proposed approaches. The results show that the classification performance is improved by using contextual information. A comparison with several other approaches also verifies the effectiveness of the proposed approaches.

pixel within the neighborhood selection window. Pixels similar to the center pixel are assigned large weights, and pixels dissimilar to the center pixel are assigned small weights. Nevertheless, the computational complexity is inevitably increased. Moreover, for both approaches in [44,45], the size of the neighboring window must also be chosen.
To overcome the problems of the rigid square neighborhood and high computational complexity, we propose to take advantage of contextual information in sparse representation-based PolSAR image classification by using superpixels. The concept of superpixels comes from the field of computer vision [46]. It refers to small homogeneous image patches that can be considered as pure elements for classification. Using superpixels has been popular for scene classification in computer vision [47][48][49]. For remotely sensed image classification, using superpixels has also drawn much attention [50][51][52][53]. In this paper, we first propose a modification of the simple linear iterative clustering (SLIC) method [54] for superpixel generation in PolSAR images. The SLIC method is chosen because it produces high-quality superpixels and is simple to implement. As the original SLIC method was developed for grayscale/color images, it is modified to deal with PolSAR images by using a statistical model-based distance function. After superpixels have been obtained, they are used to define adaptive neighborhoods for incorporating contextual information within the sparse representation-based classification framework. The first strategy is to impose a spatial regularization on the pixel-wise classification map obtained by SRC. The second strategy is to take advantage of a joint sparsity model for pixels within the same superpixel. The advantage of using superpixels is two-fold. On the one hand, superpixels provide an adaptive neighborhood for each pixel; thus, the problem of including pixels from different classes when using a rigid square local window can be avoided. On the other hand, the computational burden can be effectively reduced. Experimental results with real PolSAR datasets validate the effectiveness of the proposed approaches.
The rest of the paper is organized as follows. Section 2 describes the approach of using SRC for PolSAR image classification. In Section 3, the contextual information is included for performing polarimetric-contextual classification of PolSAR images. Experimental results are presented in Section 4, and lastly, conclusions and future work are given in Section 5.

PolSAR Image Classification with SRC
This section presents the approach for the pixel-wise classification of PolSAR images with SRC, which consists of two ingredients: polarimetric feature extraction and sparse classification.

Polarimetric Feature Extraction
For remotely sensed image classification, informative features should be extracted to distinguish pixels of different land cover types. As mentioned in the Introduction, in this paper, the feature space for PolSAR image classification is constructed by stacking a variety of polarimetric signatures.
The ground information of PolSAR images is contained in the polarimetric measurements, i.e., scattering matrices or covariance/coherency matrices. To extract features from the measured PolSAR data, simple mathematical operations, such as absolute value, summation, difference and ratio, can be used. Examples of this category of polarimetric signatures are single-channel backscattering intensities, intensity ratios, correlation coefficients, degree of polarization, etc. [55]. Although simple to compute, those parameters are often computed from partial polarimetric measurements and, thus, can only describe a specific aspect of the ground object scattering property. We can also extract polarimetric scattering parameters with various polarimetric target decomposition methods [9]. In past decades, many polarimetric target decomposition methods have been proposed, aiming to identify ground scattering mechanisms through matrix decomposition techniques. A review of target decomposition methods is available in [1]. Generally, different target decomposition methods try to interpret PolSAR data from different perspectives. Nevertheless, there is no guidance to decide which target decomposition method will lead to the most accurate classification result for a given PolSAR image.
For a specific type of polarimetric feature extraction method, the number of extracted signatures is often small. Therefore, only partial polarimetric information is preserved. To construct a feature space with comprehensive polarimetric information, various polarimetric signatures can be stacked to form a high-dimensional feature vector at each pixel. In this paper, 42 polarimetric features are extracted from the original PolSAR data [29]. Features extracted with simple mathematical operations include the backscattering coefficients of different polarization channels, three polarized ratios, three backscattering coefficient ratios, one phase difference, the depolarization ratio and the degree of polarization. Features extracted with target decomposition methods include three parameters of the Pauli decomposition, six parameters of the Krogager decomposition, six parameters of the Cloude decomposition, six parameters of the Freeman-Durden decomposition and nine parameters of the Huynen decomposition. Although the dimensionality of such a feature space is larger than the number of degrees of freedom of the original data (six for the scattering matrix and nine for the covariance/coherency matrix in the mono-static case), it is still reasonable to construct such a redundant high-dimensional feature space, since it is difficult to design a compact optimal feature space directly. Then, the effective discriminative information contained in the redundant representation is exploited with sparse representation-based classifiers.

Sparse Representation-Based Classification of PolSAR Images
Sparse representation has been established as a powerful tool in the pattern recognition field. Wright et al. [33] proposed the SRC in the context of face recognition. The underlying assumption of SRC is that each test sample can be represented by a linear combination of a few atoms from an overcomplete dictionary. Since many atoms are not used for representing the test sample, the coefficients corresponding to those unused atoms are zero, which makes the coefficient vector sparse. Let $F=\{f_1,f_2,\ldots,f_N\}\in\mathbb{R}^{N\times M}$ be the representation of the PolSAR image in the feature space, where $f_k$, $1\le k\le N$, is the feature vector of the $k$-th pixel, $N$ is the number of all pixels and $M$ is the dimensionality of the feature space. Suppose that a training dictionary is available, which is denoted by $D=\{d_1,d_2,\ldots,d_{N_T}\}\in\mathbb{R}^{N_T\times M}$, with $N_T$ samples and $C$ distinct classes. In this paper, the dictionary is constructed by collecting feature samples of labeled pixels in the PolSAR image of interest. The dictionary can be arranged in the form $D=[D_1,D_2,\ldots,D_C]$, in which $D_c\in\mathbb{R}^{N_T^c\times M}$ is the sub-dictionary of the $c$-th class with $N_T^c$ samples. Therefore, we have $N_T=\sum_{c=1}^{C}N_T^c$. It is assumed that a test polarimetric feature sample $f_k$ can be represented with the given dictionary as follows [33]:

$$f_k = D_1 w_1 + D_2 w_2 + \cdots + D_C w_C = Dw \tag{1}$$

In Equation (1), $w=[w_1^T,w_2^T,\ldots,w_C^T]^T\in\mathbb{R}^{N_T\times 1}$ is the sparse combination weight vector, in which $w_c\in\mathbb{R}^{N_T^c\times 1}$ contains the weights corresponding to atoms of the $c$-th class. The key observation of SRC is that if the test sample $f_k$ comes from the $c$-th class, then it can be well approximated by atoms from the same class. Therefore, the principle of SRC is to find the optimal weight vector in Equation (1) and assign the test sample to the class with the minimum approximation error.
To obtain the sparse representation of the test sample $f_k$, the sparse weight vector $w$ that satisfies Equation (1) should be solved. This leads to the following optimization problem:

$$\hat{w} = \arg\min_{w} \|w\|_0 \quad \text{s.t.} \quad Dw \approx f_k \tag{2}$$

where the $\ell_0$-norm $\|\cdot\|_0$ counts the non-zero elements in $w$. By solving the above optimization problem, we will find a weight vector with a minimum number of non-zero elements, while ensuring that the test sample is approximately represented by atoms from the dictionary. In Equation (2), the rough constraint $Dw \approx f_k$ is used instead of an exact constraint, so that the influence of noise and model errors can be accounted for. The problem of Equation (2) can be made more precise by using a clear requirement on the approximation accuracy, i.e., by setting a bound on the approximation error. As a result, the optimization problem in Equation (2) becomes:

$$\hat{w} = \arg\min_{w} \|w\|_0 \quad \text{s.t.} \quad \|f_k - Dw\|_2 \le \varepsilon \tag{3}$$

where $\varepsilon$ is the bound of the approximation error. Although the $\ell_0$ optimization problem is NP-hard, it can be approximately solved with greedy algorithms, such as orthogonal matching pursuit (OMP) [56].
To facilitate numerical computation, it has been proposed to replace the $\ell_0$-norm with the $\ell_1$-norm. It is shown that if the solution of Equation (3) is sparse enough and the dictionary $D$ is incoherent with the basis under which the solution is sparse, it can also be recovered by solving the following $\ell_1$-minimization problem [57,58]:

$$\hat{w} = \arg\min_{w} \|w\|_1 \quad \text{s.t.} \quad \|f_k - Dw\|_2 \le \varepsilon \tag{4}$$

The sparsity-encouraging property of the $\ell_1$-norm has been studied in the field of compressive sensing. The problem in Equation (4) is known as basis pursuit denoising (BPDN) [59]. An equivalent formulation of Equation (4) is given by the following unconstrained problem with scalar parameter $\xi$:

$$\hat{w} = \arg\min_{w} \|f_k - Dw\|_2^2 + \xi \|w\|_1 \tag{5}$$

The problems in Equations (4) and (5) are convex and, thus, can be solved efficiently with $\ell_1$-minimization techniques, such as interior point methods, proximal point methods and augmented Lagrangian methods [60]. After the sparse weight vector $w$ is obtained, the class label for the test sample $f_k$ is selected by:

$$l(f_k) = \arg\min_{c=1,\ldots,C} r_c(f_k) \tag{6}$$

where $r_c(f_k)=\|f_k - D_c w_c\|_2$ is the approximation error of the $c$-th class and $l(f_k)$ is the label of pixel $k$.
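The pixel-wise SRC described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `omp` and `src_classify` are hypothetical, atoms are stored as columns of `D` so that the product `D @ w` is defined, and the columns are normalized to unit length before the greedy search (a common convention for OMP that the text does not specify).

```python
import numpy as np

def omp(D, f, eps=1e-3, max_atoms=None):
    """Greedy OMP for: min ||w||_0  s.t.  ||f - D w||_2 <= eps.

    D: (M, N_T) array with one unit-norm atom per column (assumption).
    f: (M,) test feature vector.
    """
    f = np.asarray(f, dtype=float)
    n_atoms = D.shape[1]
    max_atoms = n_atoms if max_atoms is None else max_atoms
    w = np.zeros(n_atoms)
    support, coef = [], np.zeros(0)
    residual = f.copy()
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:  # no new atom can reduce the residual further
            break
        support.append(k)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], f, rcond=None)
        residual = f - D[:, support] @ coef
    if support:
        w[support] = coef
    return w

def src_classify(D, atom_labels, f, eps=1e-3):
    """Label f by the class with the smallest residual ||f - D_c w_c||_2."""
    atom_labels = np.asarray(atom_labels)
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    w = omp(D, f, eps)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(f - D[:, atom_labels == c] @ w[atom_labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]
```

The default `eps=1e-3` mirrors the error bound reported for SRC in the experiments; any other stopping rule (for example a fixed sparsity level via `max_atoms`) would be an equally valid choice for this sketch.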

Polarimetric-Contextual Classification of PolSAR Images
Previous works have shown that including contextual information in the classification process helps to improve the classification performance [38][39][40][41][42][43]. For sparse representation with contextual information for remotely sensed image classification, the reader is referred to [44,45]. Although the method in [44] can take advantage of contextual information, the considered neighborhood of a pixel is a rigid square, which may contain pixels from different classes. The method in [45] tackles this problem by assigning non-local weights to neighboring pixels. Nevertheless, this method is computationally expensive, as additional non-local weights need to be computed. In this paper, we investigate alternative ways to exploit contextual information within the framework of SRC for PolSAR image classification. We combine SRC with the concept of superpixels. Based on superpixels, two strategies are considered to perform polarimetric-contextual classification of PolSAR images. In the following, the method to generate superpixels in PolSAR images is presented first. Then, the methods for classification are described.

Superpixel Generation in PolSAR Images
The concept of superpixels was introduced in [46] and refers to small homogeneous regions in images. Superpixels are often obtained by over-segmentation methods. When used for classification, pixels within one superpixel are assumed to belong to the same class. In the remote sensing community, a similar concept is the so-called object-oriented analysis [50], in which over-segmented small regions (objects) are taken as analysis elements for classification. Because superpixels can suppress the influence of speckle noise and clutter, a number of works have investigated superpixel-based classification of SAR/PolSAR images [51][52][53].
To use superpixels for PolSAR image classification, they should be generated first. Several superpixel algorithms can be found in the literature, but none was designed specifically for PolSAR images. The watershed algorithm [61] and the normalized cut algorithm [62] have been used for superpixel generation for PolSAR images. However, the watershed algorithm produces highly irregular superpixels, and the normalized cut algorithm is computationally expensive [54]. In this paper, we introduce a modified version of the SLIC algorithm [54]. Although simple, SLIC has been shown to be very effective and efficient for superpixel generation.
The basic idea of SLIC is to iteratively assign pixels to the nearest superpixels. At the beginning of the algorithm, superpixels are initialized by placing a set of seeds on the image domain. Then, the algorithm alternates between two steps: (1) fix the superpixel centers and assign each pixel to the nearest superpixel according to a distance measure; (2) update the superpixel centers. It should be noted that for each superpixel, only pixels in the neighborhood of the center are allowed to be assigned to it. The size of the neighborhood is predefined and decides the maximum size of one superpixel.
The key issue of SLIC is the definition of the distance measure between pixels and superpixel centers. In [54], for a pixel $i$ and a superpixel center $j$, the distance measure is defined by Equation (7), in which the distance between a pixel and a superpixel center is computed by combining two distances. The term $d_c(i,j)=\|c_i-c_j\|_2$ is the Euclidean distance between $i$ and $j$ in the color space, where $c_i$ and $c_j$ are the color vectors of $i$ and $j$, respectively. The term $d_s(i,j)=\|x_i-x_j\|_2$ is the spatial distance between $i$ and $j$ in the image domain, where $x_i$ and $x_j$ are spatial location vectors. $S$ is the maximal size of a superpixel and $\eta$ is the weight to tune the contributions of color similarity and spatial proximity.
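The display form of Equation (7) is not reproduced in this text. For orientation only, the distance used by the published SLIC algorithm [54] can be written with the symbols defined above as shown below. Reading $\eta$ as the compactness weight of [54] is an assumption, and the exact expression used by the authors may differ.

```latex
% Assumed form, following the SLIC distance of [54];
% \eta is taken to play the role of the compactness weight.
d(i,j) = \sqrt{\, d_c(i,j)^2 + \eta^2 \left( \frac{d_s(i,j)}{S} \right)^{2} }
```

Under this form, a larger $\eta$ gives more influence to spatial proximity and therefore yields more compact superpixels, while a smaller $\eta$ lets superpixels follow feature boundaries more closely.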
To extend the SLIC algorithm to deal with PolSAR data, the distance measure should be modified according to the properties of PolSAR data. We keep the definition of the spatial distance unchanged and replace the feature-based distance with a statistical model-based measure. Statistical model-based distance measures have been shown to be more suitable than the Euclidean distance for SAR/PolSAR data. In this paper, a Wishart distribution-based distance $d_p(i,j)$ is used [10]:

$$d_p(i,j) = \ln|\Sigma_j| + \mathrm{Tr}\left(\Sigma_j^{-1} T_i\right) \tag{8}$$

where $T_i$ is the coherency matrix of pixel $i$ and $\Sigma_j$ is the mean coherency matrix of the superpixel centered at $j$. Substituting $d_p(i,j)$ into Equation (7) to replace the color-based distance $d_c(i,j)$ gives the distance measure for superpixel generation for PolSAR images. After superpixels have been generated, we have a collection of superpixels $\{sp_j,\ j=1,\ldots,N_{sp}\}$, where $N_{sp}$ is the number of all superpixels. For each superpixel, pixels within it are assumed to belong to the same class. The next step is to incorporate contextual information derived from the superpixel map into sparse representation-based classification. In this paper, two approaches are considered to combine polarimetric and contextual information.
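As an illustration of the statistical distance used in place of the color distance, the sketch below evaluates the standard Wishart distance $\ln|\Sigma_j| + \mathrm{Tr}(\Sigma_j^{-1}T_i)$ between a pixel coherency matrix and a superpixel mean. The function name `wishart_distance` is hypothetical, and both inputs are assumed to be Hermitian positive-definite matrices of the same size.

```python
import numpy as np

def wishart_distance(T_i, Sigma_j):
    """Wishart distance ln|Sigma_j| + Tr(Sigma_j^{-1} T_i).

    T_i:     coherency matrix of a pixel (Hermitian, positive definite).
    Sigma_j: mean coherency matrix of a superpixel (same assumptions).
    """
    # slogdet is numerically safer than log(det(.)) for small determinants.
    _, logdet = np.linalg.slogdet(Sigma_j)
    # solve(Sigma_j, T_i) computes Sigma_j^{-1} T_i without an explicit inverse.
    trace_term = np.trace(np.linalg.solve(Sigma_j, T_i))
    # The trace is real for Hermitian inputs; drop round-off in the imaginary part.
    return float(logdet + np.real(trace_term))
```

In a SLIC-style assignment step, each pixel would be compared only with the superpixel centers in its predefined neighborhood, and this value would be combined with the spatial distance before choosing the nearest center.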

Combining SRC with Superpixels by Majority Voting
The first approach is to regularize the pixel-wise classification map obtained with SRC by superpixel-based majority voting. Majority voting is a long-standing and popular method for combining the results of a set of classifiers. It has been used to improve the classification result obtained by SVM for hyperspectral images [63]. Note that majority voting with rigid neighborhoods can also impose spatial regularization on pixel-wise classification maps. However, using an adaptive neighborhood, such as a superpixel (or a segment in [63]), helps to preserve class boundaries and suppress over-smoothing. In our case, each superpixel in the PolSAR image is taken as a unit, and all pixels in it are supposed to have the same class label. However, the pixel-wise SRC classifies each pixel independently. This can be considered as classifying a superpixel with different descriptors (i.e., feature vectors associated with pixels in the superpixel). Therefore, the majority voting is actually a decision fusion process, in which the classification results obtained with different descriptors are combined.
The principle of superpixel-based majority voting is shown in Figure 1. For a given superpixel, we count the number of times that each class label occurs in that superpixel. The class label that occurs most often is selected and assigned to all pixels within that superpixel. The decision rule of majority voting is formally defined as:

$$l(sp_j) = \arg\max_{c=1,\ldots,C} \sum_{i\in sp_j} \delta\big(l(i)-c\big) \tag{9}$$

where $\delta(z)$ is the delta function ($\delta(z)=1$ if $z=0$ and $\delta(z)=0$ otherwise), $sp_j$ denotes the superpixel indexed by $j$, and $l(sp_j)$ and $l(i)$ denote the labels of superpixel $sp_j$ and pixel $i$, respectively.
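The majority-voting rule can be sketched in a few lines. The function name `superpixel_majority_vote` and the array layout (one integer class label and one integer superpixel index per pixel) are assumptions made for illustration; ties, which the text does not address, are resolved here in favor of the smallest class label.

```python
import numpy as np

def superpixel_majority_vote(pixel_labels, superpixel_map):
    """Give every pixel the most frequent pixel-wise label of its superpixel.

    pixel_labels:   integer array of pixel-wise class labels (e.g., from SRC).
    superpixel_map: integer array of the same shape with superpixel indices.
    """
    pixel_labels = np.asarray(pixel_labels)
    superpixel_map = np.asarray(superpixel_map)
    regularized = np.empty_like(pixel_labels)
    for sp in np.unique(superpixel_map):
        mask = superpixel_map == sp
        # Count how often each class label occurs inside this superpixel.
        classes, counts = np.unique(pixel_labels[mask], return_counts=True)
        # argmax returns the first maximum, i.e., the smallest label on ties.
        regularized[mask] = classes[np.argmax(counts)]
    return regularized
```

For example, a superpixel whose pixels were labeled 0, 0, 0 and 1 by the pixel-wise classifier is relabeled entirely as class 0.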

Figure 1.
The principle of combining sparse representation-based classifiers (SRCs) with superpixels by majority voting: pixels in a given superpixel are forced to have the same class label. For the classification map, different grayscales represent different classes.

Polarimetric-Contextual Classification with Superpixel-Based JSRC
Another strategy to combine polarimetric and contextual information is to classify each superpixel directly according to the features within it. One way is to compute a single descriptor for each superpixel, e.g., the mean feature vector, and then classify the superpixel with this single descriptor. However, this may cause information loss. Within the framework of sparse representation-based classification, this problem can be addressed with a joint sparsity model [64,65]. The underlying assumption of the joint sparsity model is that if a set of test samples are from the same class, they can be represented by similar dictionary atoms (i.e., the associated sparse representation weight vectors share a similar sparsity pattern). It is shown that by making use of the correlation between weight vectors, a more accurate sparse model can be derived. In our case, since pixels within one superpixel are considered to belong to the same class, the corresponding feature vectors would share a similar sparsity pattern. Therefore, we can solve the sparse representation weight vectors for all pixels in a superpixel simultaneously with a constraint that forces those weight vectors to have non-zero elements at similar positions.
Consider a specific superpixel $sp_j$ in the PolSAR image, which contains $N_j$ pixels associated with the same number of polarimetric feature vectors. Those feature vectors are arranged in a matrix as:

$$F_j = [f_1, f_2, \ldots, f_{N_j}] \tag{10}$$

In the joint sparsity model, all columns of $F_j$ are represented with the same dictionary:

$$F_j = D W_j \tag{11}$$

where $W_j=[w_1,w_2,\ldots,w_{N_j}]$ is the weight matrix whose $k$-th column contains the weights of the $k$-th pixel. The weight matrix $W_j$ should satisfy two requirements: (1) each column is sparse; (2) the non-zero elements in all columns should be located at similar positions. It turns out that such constraints can be enforced by minimizing the row-$\ell_0$ norm $\|W_j\|_{row,0}$, which counts the non-zero rows of $W_j$ [66]. Therefore, the optimization problem associated with the joint sparse model is:

$$\hat{W}_j = \arg\min_{W_j} \|W_j\|_{row,0} \quad \text{s.t.} \quad DW_j \approx F_j \tag{12}$$

Similar to the derivation of Equation (3), the model in Equation (12) can also be made more precise by introducing an error bound:

$$\hat{W}_j = \arg\min_{W_j} \|W_j\|_{row,0} \quad \text{s.t.} \quad \|F_j - DW_j\|_F \le \varepsilon \tag{13}$$

where $\|\cdot\|_F$ denotes the Frobenius norm and $\varepsilon$ is the bound of the approximation error. Solving Problem (13) is difficult, just as in the case of the sparse representation of a single signal. This difficulty is often addressed by replacing the row-$\ell_0$ norm with a more tractable norm [66]. In this paper, we use the $\ell_1-\ell_2$ norm, which is defined as:

$$\|W_j\|_{1,2} = \sum_{n=1}^{N_T} \sqrt{\sum_{k=1}^{N_j} \left(w^{j}_{n,k}\right)^2}$$

where $w^{j}_{n,k}$ denotes the element of $W_j$ at row $n$ and column $k$.
Replacing the row-$\ell_0$ norm in Problem (13) with the $\ell_1-\ell_2$ norm yields the following optimization problem:

$$\hat{W}_j = \arg\min_{W_j} \|W_j\|_{1,2} \quad \text{s.t.} \quad \|F_j - DW_j\|_F \le \varepsilon$$

The class label for pixels in the superpixel $sp_j$ is given by:

$$l(sp_j) = \arg\min_{c=1,\ldots,C} \|F_j - D_c \hat{W}_j^{c}\|_F$$

where $\hat{W}_j^{c}$ contains the rows of $\hat{W}_j$ associated with the atoms of the $c$-th class. It should be noted that although the proposed approach and the approaches in [44,45] are all based on the JSRC, they are quite different. In the methods of [44,45], the classification is performed pixel-wise. At each pixel, the associated feature vector and the feature vectors of neighboring pixels are sparsely represented simultaneously to classify that pixel. Therefore, the number of processing elements is the same as the number of pixels. In contrast, in the proposed approach, the superpixel map is obtained first, and the classification is performed at the superpixel level. As a result, the number of processing elements is the same as the number of superpixels, which is much smaller than the number of pixels. This helps to reduce the computational burden. Moreover, the neighborhoods of the method in [44] are rigid squares; thus, they may include pixels of different classes. Although the method in [45] addresses this problem by using non-local weights, the computational complexity is further increased. In the proposed approach, the neighborhoods are adaptively selected by superpixels, thus avoiding the problem caused by rigid neighborhoods.
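The superpixel-level classification can be sketched with the greedy simultaneous OMP (SOMP) algorithm, which targets the row-sparse problem with an error bound. This is a minimal sketch under stated assumptions rather than the authors' code: the names `somp` and `jsrc_classify` are hypothetical, atoms are stored as unit-norm columns of `D`, and the feature vectors of one superpixel are the columns of `F`.

```python
import numpy as np

def somp(D, F, eps=1e-2, max_atoms=None):
    """Greedy SOMP for: min ||W||_row,0  s.t.  ||F - D W||_F <= eps.

    D: (M, N_T) array with one unit-norm atom per column (assumption).
    F: (M, N_j) array holding the feature vectors of one superpixel.
    """
    F = np.asarray(F, dtype=float)
    n_atoms = D.shape[1]
    max_atoms = n_atoms if max_atoms is None else max_atoms
    W = np.zeros((n_atoms, F.shape[1]))
    support, coef = [], None
    residual = F.copy()
    # For a 2-D array, np.linalg.norm defaults to the Frobenius norm.
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # Score each atom by its joint correlation with all residual columns.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        k = int(np.argmax(scores))
        if k in support:
            break
        support.append(k)
        # All pixels share the same support; coefficients are re-fit jointly.
        coef, *_ = np.linalg.lstsq(D[:, support], F, rcond=None)
        residual = F - D[:, support] @ coef
    if support:
        W[support, :] = coef
    return W

def jsrc_classify(D, atom_labels, F, eps=1e-2):
    """Label a superpixel by the class with the smallest ||F - D_c W_c||_F."""
    atom_labels = np.asarray(atom_labels)
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    W = somp(D, F, eps)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(F - D[:, atom_labels == c] @ W[atom_labels == c, :])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]
```

Because one call to `jsrc_classify` labels every pixel of a superpixel, the number of sparse coding problems equals the number of superpixels rather than the number of pixels, which is the source of the computational saving discussed above.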

Results and Discussion
In this section, we evaluate the effectiveness of the proposed sparse representation- and superpixel-based PolSAR image classification approaches. The main objective of the experimental validation is two-fold. Firstly, the ability of sparse representation-based classifiers to produce favorable PolSAR image classification is verified. To achieve this, the proposed approaches are compared with the widely-used Wishart model-based classifiers. Since the two proposed approaches exploit contextual information, two Wishart model-based competitors, which also make use of contextual information, are considered. One is a region-based Wishart maximum-likelihood (Wishart-ML) classifier [10], which takes superpixels as processing elements [53]. The other one is the Wishart-MRF classifier [14,15], which exploits contextual information by using the MRF model. We adopt the well-known graph cut algorithm [67] for energy optimization for the sake of efficiency. In addition, the SVM based on composite kernels (SVMCK) classifier [68] is also tested. SVM-based classifiers have been shown to be powerful tools in the remote sensing field and have been successfully applied to PolSAR images [27,29]. Using composite kernels enables us to incorporate contextual information into the SVM classifier. The second objective of the experimental validation is to demonstrate the advantage of combining sparse representation-based classifiers with superpixels. This is achieved by comparing the two proposed approaches with other sparse representation-based classifiers. The results obtained by the pixel-wise SRC approach are presented to show the gain in classification accuracy obtained by making use of contextual information. Moreover, two joint sparse representation-based classifiers are also evaluated, namely the joint sparse representation-based approach that considers square neighborhoods (JSRC-SQ) [44] and the improved approach with non-local weights (JSRC-NLW) [45].
The approaches proposed in this paper are denoted as SRC-MV and JSRC-SP, respectively. It should be noted that all approaches make use of the same polarimetric features, except the two Wishart model-based classifiers.
In practice, we can solve Equation (3) or Equation (5) to implement SRC, as well as Equation (13) or Equation (16) to implement JSRC. Nevertheless, in this paper, we do not intend to evaluate and compare the advantages and disadvantages of using different sparsity-promoting norms. Therefore, in all experiments, Problem (3) is adopted for SRC and is solved with the OMP algorithm. For JSRC, Problem (16) is adopted and is solved by the simultaneous OMP (SOMP) algorithm. The approximation error bound is set as 0.001 for SRC and 0.01 for JSRC. As can be observed in the following experiments, although being suboptimal, this setting provides very good classification results. Following the work of [68], the spatial kernel in the SVMCK approach is constructed by using the mean feature vector of a 5 × 5 local window centered at each pixel. Besides, the SVM parameters are decided by five-fold cross-validation. The weight for kernel summation is varied in the range [0, 1] with steps of 0.1. The penalizing factor in the SVM is tuned in the range of . The neighborhood size of JSRC-SQ and JSRC-NLW is chosen between 3 × 3 and 13 × 13 according to [44,45]. For the proposed two approaches and the Wishart-ML approach, the sizes of superpixels are varied. Nevertheless, we initialize the superpixel seeds as regular patches with a fixed size, so that the mean superpixel size can be roughly controlled (as the number of superpixels is decided by the initial patch size). The weight parameter η in Equation (7) is set as two, which empirically keeps a reasonable balance between the compactness and feature coherence of superpixels. All of the experiments were conducted using MATLAB R2010b on a 3.40-GHz machine with 4.0 GB RAM. We evaluate different algorithms on two different real PolSAR datasets. In our experiments, the training samples are randomly selected from the available reference data and the remaining samples are used for validation purposes. 
This strategy is widely adopted in the remote sensing community [39,40]. In the case that a superpixel contains training samples, the selected pixels are excluded from the superpixel in the classification process. The same treatment is used when the neighborhood of a pixel contains training samples for JSRC-SQ and JSRC-NLW. The overall accuracy (OA) and the kappa coefficient are used to evaluate the accuracy of the classification. The efficiency of different algorithms is assessed by the CPU time cost. The performance indexes are obtained by averaging the values obtained over ten Monte Carlo runs. In the following, we report the experimental results on two real PolSAR images.
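For reference, the two accuracy indexes can be computed from a confusion matrix as sketched below. The function name `overall_accuracy_and_kappa` is hypothetical; the formulas are the standard definitions of overall accuracy and Cohen's kappa, and the sketch assumes that the chance agreement is strictly smaller than one.

```python
import numpy as np

def overall_accuracy_and_kappa(y_true, y_pred):
    """Return (OA, kappa) for two label arrays of equal length."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    classes = np.unique(np.concatenate([y_true, y_pred]))
    # Map labels to 0..C-1 and accumulate the confusion matrix
    # (rows: reference labels, columns: predicted labels).
    t = np.searchsorted(classes, y_true)
    p = np.searchsorted(classes, y_pred)
    cm = np.zeros((classes.size, classes.size))
    np.add.at(cm, (t, p), 1)
    total = cm.sum()
    oa = np.trace(cm) / total  # observed agreement
    # Chance agreement from the row and column marginals.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
    kappa = (oa - pe) / (1.0 - pe)
    return oa, kappa
```

Averaging the returned values over repeated random draws of the training set gives indexes comparable to those reported from the Monte Carlo runs.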

RadarSat-2 Flevoland Dataset
The first dataset is a C-band fully-polarimetric SAR image collected by the RadarSat-2 system at the fine quad mode over the area of Flevoland, Netherland. The used subset consists of 700 × 780 pixels. Figure 2a is a false color image obtained by Pauli decomposition, and Figure 2b is the manually-labeled reference map. A total of four classes are identified, which are the building area, woodland, farmland and water area. Pixels with no reference label are shown in gray. The first experiment with the RadarSat-2 dataset is to illustrate the feasibility and advantage of using superpixels to incorporate contextual information into the sparse representation-based classification framework for PolSAR images. We constructed the training dictionary by randomly selecting 1% of the available labeled pixels as atoms. In our experiment, we noticed that the size of the initial patches for superpixel generation could affect the classification accuracy. Therefore, we conducted an additional experiment to analyze the effect of initial patch size on the classification accuracy. Figure 3 shows the change of the OA when the initial patch size increases from 3 × 3 to 13 × 13. It can be noticed that the best OA occurs when the initial patch size is 9 × 9 for both approaches. For the proposed two approaches, the highest OAs reach 95.72% (SRC-MV) and 92.89% (JSRC-SP), respectively, which are both much higher that the OA of 86.02% achieved by SRC. This demonstrates the effectiveness of using superpixels for incorporating contextual information for sparse representation-based PolSAR image classification. Besides, we can also conclude that the size of superpixels could have an impact on the classification accuracy. To further validate the performance of the proposed two approaches, we report results obtained by other competitors. Figure 4 shows the classification obtained by different approaches by using 1% of labeled pixels as training samples. 
In Figure 4a, the superpixels generated with 9 × 9 initial patches are illustrated. This superpixel map is then used in the proposed SRC-MV approach and JSRC-SP approach, as well as the Wishart-ML approach. The optimal neighborhood size for the JSRC-SQ approach and JSRC-NLW approach are 7 × 7 and 11 × 11, respectively. Several observations can be made from Figure 4. Firstly, compared with other approaches, SRC produces the most noisy classification result, notably in the woodland and building areas, which have strong textures. This proves the importance of taking advantage of contextual information for the classification of PolSAR images. Secondly, the two Wishart model-based approaches have relatively poor performance. When the resolution of the PolSAR image increases and the texture is present in the image, the scene becomes heterogeneous and the applicability of the Wishart model reduces. As a result, classification with the Wishart model may have limited accuracy. Finally, it should be noticed that for superpixel-based approaches, a superpixel is considered as a whole for classification. Therefore, if a wrong classification decision is made, then all pixels in a superpixel may be forced to have the wrong class label. This phenomenon can be observed from Figure 4c,d,h.
In Table 1, we report the quantitative accuracy indexes for different approaches on the RadarSat-2 dataset. From Table 1, we can notice the poor performance of Wishart model-based approaches. Even though the contextual information has been exploited, the classification accuracy is still less than the pixel-wise SRC approach. This clearly demonstrates the advantage of collecting varies polarimetric signatures to unfold the discriminative information contained in PolSAR data. However, the pixel-wise SRC approach still has relatively low classification accuracy compared with other polarimetric-contextual approaches. Among those approaches, sparse representation-based approaches provide favorable performance. The proposed JSRC-SP approach produces accuracy indexes that are very close to those obtained by JSRC-SQ and that are a bit lower than those obtained by JSRC-NLW. The advantage of the proposed JSRC-SP approach is that it can choose adaptive neighborhoods for classification and avoids the neighborhoods containing pixels from different classes. However, as the reference map is non-exhaustive, few labeled pixels are located at terrain class boundaries. Therefore, we would expect that the JSRC-SQ approach will achieve similar accuracy performance as the JSRC-SP approach. Nevertheless, the SRC-MV approach reaches the highest OA and kappa coefficient among all of the approaches. Another important aspect to assess the performance of algorithms is the efficiency. In Table 2, we report the running times for different approaches. We only present the running time for polarimetric feature vector-based approaches, since those two Wishart model-based approaches have relatively poor accuracy performance. For the remaining six approaches, the time for feature extraction is not counted, since it is the same for all approaches. Besides, for superpixel-based approaches, the time for superpixel generation is added to the overall running time. 
It can be seen that both the JSRC-SQ approach and the JSRC-NLW approach have rather long running times compared with the other approaches. This is because both approaches need to solve a joint sparse representation problem at each pixel. On the other hand, the two proposed approaches take much less time to produce the final classification result. JSRC-SP is the most efficient among the six approaches. The reason is that in JSRC-SP, the classification is performed at the superpixel level; thus, the number of processing units is much smaller than the number of pixels in the image. This saves a lot of time. The SRC-MV approach is less efficient than JSRC-SP. Nevertheless, in this case study, it produces higher accuracy.
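For reference, the two accuracy indexes discussed for Table 1, the overall accuracy (OA) and the kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch using the standard definitions; the function names and the convention that rows hold reference labels and columns hold predicted labels are assumptions made for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of test pixels of reference class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def overall_accuracy_and_kappa(cm):
    """Return (OA, kappa) computed from a square confusion matrix."""
    cm = np.asarray(cm, dtype=np.float64)
    total = cm.sum()
    # Observed agreement: fraction of correctly classified test pixels.
    p_o = np.trace(cm) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    # Kappa is undefined (0/0) when p_e equals 1; that case is not handled here.
    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa

# Toy example with two classes and four test pixels.
cm = confusion_matrix([0, 0, 1, 1], [0, 0, 1, 0], n_classes=2)
oa, kappa = overall_accuracy_and_kappa(cm)
```

Because kappa discounts the agreement expected by chance, it is typically lower than OA when the class sizes are unbalanced, which is why both indexes are usually reported together.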

EMISAR Foulum Dataset
This scene was acquired by the C-band fully-polarimetric EMISAR system in April 1998 over the area of Foulum, Denmark. In Figure 5a, the 332 × 437-pixel false color image obtained by Pauli decomposition is illustrated. A reference map with five classes has been created [30], as shown in Figure 5b. The five classes are rye, coniferous, oat, winter wheat and water. Unlabeled pixels are shown in gray. This scene constitutes a challenging classification problem because of the high intra-class heterogeneity and the unbalanced number of available labeled pixels per class. We randomly sampled 5% of the labeled pixels for each class as training samples, and the rest of the labeled pixels are taken as test samples. Similar to the processing of the first dataset, we conducted an experiment to decide the optimal initial patch size for superpixel generation. We find that the best accuracy is reached when the initial patch size is 7 × 7 (as shown in Figure 6). In Figure 7a, the generated superpixels are illustrated. The optimal neighborhood sizes for the JSRC-SQ approach and JSRC-NLW approach are 7 × 7 and 9 × 9, respectively. Figure 7b-i shows the classification results obtained by the different approaches. It can be observed that, while the SRC approach produces many noise-like errors, the Wishart model-based approaches cause many errors for the water class due to the strong heterogeneity of the scattering mechanism. For the other approaches, evaluating the classification performance by visual interpretation is difficult. Therefore, we computed quantitative indexes, which are reported in Table 3.
The running time of the different approaches on the EMISAR Foulum dataset is shown in Table 4. Similar to the observation on the RadarSat-2 Flevoland dataset, a significant running time reduction has been achieved by the proposed approaches compared with JSRC-SQ and JSRC-NLW. The JSRC-SP approach is the most efficient one among the compared approaches. SRC-MV takes more time than JSRC-SP, SRC and SVMCK, but produces better accuracy.
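The per-class random split described for this dataset (5% of the labeled pixels of each class for training, the rest for testing) can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_training_pixels`, the encoding of unlabeled pixels as 0, the fixed random seed and the rule of keeping at least one training pixel per class are all hypothetical choices, not details reported for the experiments.

```python
import numpy as np

def sample_training_pixels(reference_map, fraction=0.05, unlabeled=0, seed=0):
    """Randomly pick a fixed fraction of the labeled pixels of every class.

    Returns two boolean masks (train, test) with the shape of reference_map.
    """
    rng = np.random.default_rng(seed)
    ref = np.asarray(reference_map)
    flat = ref.ravel()
    train = np.zeros(flat.shape, dtype=bool)
    for c in np.unique(flat):
        if c == unlabeled:
            continue  # pixels without a reference label are never sampled
        idx = np.flatnonzero(flat == c)
        # Assumption: keep at least one training pixel for very small classes.
        n_train = max(1, int(round(fraction * idx.size)))
        train[rng.choice(idx, size=n_train, replace=False)] = True
    # All remaining labeled pixels are used as test samples.
    test = (flat != unlabeled) & ~train
    return train.reshape(ref.shape), test.reshape(ref.shape)

# Toy reference map: 100 pixels of class 1, 40 of class 2, 10 unlabeled.
toy_map = np.concatenate([np.full(100, 1), np.full(40, 2), np.zeros(10, dtype=int)])
train_mask, test_mask = sample_training_pixels(toy_map.reshape(10, 15))
```

Sampling each class separately keeps the class proportions of the reference map in the training set, which matters here because the number of labeled pixels per class is unbalanced.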

Conclusions
In this paper, we have investigated the classification of PolSAR images with sparse representation-based classifiers. First, we investigated the feasibility of using sparse representation-based methods for PolSAR image classification. It is shown that sparse representation-based classification methods obtain superior classification performance for PolSAR images when compared with traditional Wishart model-based classifiers. Second, two novel strategies, based on majority voting and on joint sparse representation with superpixels, respectively, were proposed to incorporate contextual information into sparse representation-based PolSAR image classification. It is shown that sparse representation-based PolSAR image classification benefits from incorporating contextual information with superpixels: compared with the pixel-wise SRC classifier, using contextual information improves the classification performance. Moreover, using superpixels not only makes the contextual information adaptive, but also reduces the computational burden. Therefore, the proposed approaches can achieve favorable classification accuracy with reduced computational burden when compared to previous joint sparse representation-based approaches for remote sensing image classification.
Comparative experiments with real PolSAR datasets have been conducted to verify the performance of the proposed approaches. Two real PolSAR images have been used: a RadarSat-2 dataset over the region of Flevoland and an EMISAR dataset over the area of Foulum. Experimental results demonstrate that the proposed approaches provide favorable classification performance when compared against the region-based Wishart-ML classifier, the Wishart-MRF classifier, the SVMCK classifier, the pixel-wise sparse representation-based classifier and two other joint sparse representation-based classifiers. The proposed SRC-MV approach achieves the highest classification accuracy on both tested datasets (95.72% on the RadarSat-2 dataset and 98.79% on the EMISAR dataset). The other approach proposed in this paper (JSRC-SP) also produces accurate classification results on both datasets. Its overall accuracy is 92.89% on the RadarSat-2 dataset and 97.44% on the EMISAR dataset, which is noticeably higher than that of SRC, Wishart-ML, Wishart-MRF and SVMCK and is competitive with the other two joint sparse representation-based classifiers. Further evaluation of the running time demonstrates that the two proposed approaches have favorable efficiency. Among the compared approaches, the proposed JSRC-SP is the most efficient one (112.83 s for the RadarSat-2 dataset and 13.05 s for the EMISAR dataset). Considering its high efficiency, the JSRC-SP approach could be an interesting candidate when efficiency is an important factor.
In view of the above results, this work not only enriches the family of sparse representation-based classification methods by using superpixels to incorporate contextual information, but also provides interesting alternative approaches for PolSAR image classification.
Future work will focus on improving the performance of PolSAR image classification with sparse representation techniques. One possible direction is to construct a better dictionary with dictionary learning methods. It is expected that the classification accuracy will be further enhanced by exploiting the discriminant information in the training samples when learning the dictionary. Another direction is to cope with the possible non-linear property of PolSAR features by using kernel methods. Other strategies to exploit contextual information will also be studied. For example, it is possible to combine SRC with smoothness-promoting models, such as the MRF model or variational methods. The key issue will be how to take advantage of the outputs of SRC in those models.

Author Contributions
Zongjie Cao, Jilan Feng and Yiming Pi contributed to the conception of this study and designed the experiments. Jilan Feng carried out the experiments and analyzed experimental results with Zongjie Cao. Jilan Feng and Zongjie Cao prepared the manuscript, and Yiming Pi also made critical revisions.