PolSAR Image Classification Based on Statistical Distribution and MRF

Classification is an important topic in synthetic aperture radar (SAR) image processing and interpretation. Because of speckle and geometric distortions in imaging, land cover mapping is a challenging task, especially in complex landscapes. In this study, we aim to find a robust and efficient method for polarimetric SAR (PolSAR) image classification. The Markov random field (MRF) has been widely used for capturing the spatial-contextual information of an image. We first introduce two ways to construct the Wishart mixture model and compare their performance using real PolSAR data. Then, the better mixture model and two other classical statistical distributions are combined with MRF to construct the MRF models. To improve the robustness of the models, a constant false alarm rate (CFAR)-based edge penalty term and an adaptive neighborhood system are embedded into the MRF energy functional. Classification is implemented in two schemes, i.e., pixel-based and region-based classification. Finally, agricultural fields are used as the test scenario to evaluate the robustness and applicability of these algorithms.


Introduction
Owing to its all-weather, day-and-night, high-resolution imaging capability, synthetic aperture radar (SAR) is widely favored in remote sensing and has become an indispensable branch of remote sensing information acquisition technology. A conventional SAR system acquires the scattering characteristics of ground targets in one specific combination of transmitting and receiving polarizations and is therefore a single-channel system. Polarimetric SAR (PolSAR) offers multichannel information, and fully (quad) polarimetric SAR systems allow the complete backscattering characterization of scatterers [1,2]. Polarimetric information greatly improves the application capabilities of satellite SAR systems [3,4]. Classification is an important step in image analysis and interpretation and is a research hotspot in many applications. However, polarimetric SAR data acquired from different terrain types have different statistics, which causes difficulties in PolSAR image classification.
In 1988, Kong et al. [5] developed a systematic approach for PolSAR image classification. They proposed that land cover classification aims to assign pixels to different classes. Since then, many researchers have continued this line of research, and many classification algorithms have been proposed. According to the features used, polarimetric SAR image classification methods can be summarized into two categories: classification based on statistical properties and classification based
Remote Sens. 2020, 12, 1027
For homogeneous regions in multi-look PolSAR images, the polarimetric covariance matrices $C = \{C_1, C_2, \ldots, C_N\}$ obey the Wishart distribution [26],
$$p(C) = \frac{L^{Lq}\,|C|^{L-q}\exp\{-L\,\mathrm{Tr}(\Sigma^{-1}C)\}}{R(L,q)\,|\Sigma|^{L}}, \quad (1)$$
where $\mathrm{Tr}(\cdot)$ stands for the trace of a matrix, $L$ denotes the number of looks, $q$ is the number of channels, and $\Sigma$ is the mean covariance matrix. $R(L,q) = \pi^{q(q-1)/2}\,\Gamma(L)\cdots\Gamma(L-q+1)$ is the scaling function, with $\Gamma(\cdot)$ the Gamma function.
The complex Wishart distribution is simple and easy to calculate. In 1994, Lee et al. [10] used the Gamma distribution to derive the K distribution for the covariance matrix and then combined it with the Wishart distribution to derive the K-Wishart distribution. The K-Wishart distribution, which combines the Gamma distribution as the texture component and the Wishart distribution as the speckle component, is better at modeling textured areas:
$$p(C) = \frac{2\,|C|^{L-q}\,(L\alpha)^{(\alpha+Lq)/2}}{R(L,q)\,\Gamma(\alpha)\,|\Sigma|^{L}}\,\left[\mathrm{Tr}(\Sigma^{-1}C)\right]^{(\alpha-Lq)/2} K_{\alpha-Lq}\!\left(2\sqrt{L\alpha\,\mathrm{Tr}(\Sigma^{-1}C)}\right), \quad (2)$$
where $K_{\rho}(\cdot)$ is the modified Bessel function of the second kind with order $\rho$ and $\alpha$ is the shape parameter. The K-Wishart distribution is suitable not only for homogeneous areas but also for heterogeneous areas, whose probability distribution is not fitted well by the complex Wishart distribution. It agrees better with real data even for extremely heterogeneous regions.
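As an illustration of how the Wishart model can be evaluated numerically, the following is a minimal sketch of the log-density of the complex Wishart distribution. The function names `log_scaling` and `wishart_logpdf` are hypothetical and not taken from the paper; working in the log domain is an implementation choice made here to avoid overflow of the determinant powers.

```python
import numpy as np
from math import lgamma, log, pi


def log_scaling(L, q):
    """ln R(L, q), with R(L, q) = pi^(q(q-1)/2) * Gamma(L) ... Gamma(L-q+1)."""
    return q * (q - 1) / 2 * log(pi) + sum(lgamma(L - i) for i in range(q))


def wishart_logpdf(C, Sigma, L):
    """Log-density of a q x q multi-look covariance matrix C.

    Sigma is the mean covariance matrix and L the number of looks.
    """
    q = C.shape[0]
    # slogdet returns (sign, log|det|); the determinants are real and positive
    _, logdet_C = np.linalg.slogdet(C)
    _, logdet_S = np.linalg.slogdet(Sigma)
    # Tr(Sigma^{-1} C) without forming the inverse explicitly
    trace_term = np.trace(np.linalg.solve(Sigma, C)).real
    return (L * q * log(L) + (L - q) * logdet_C - L * trace_term
            - log_scaling(L, q) - L * logdet_S)
```

For a single channel (q = 1), this expression reduces to the Gamma log-density of multi-look intensity, which gives a simple sanity check.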

Mixture Strategies
Sometimes, the data we are trying to model are complex, in which case the mixture distribution is usually used. Mixture models can better describe heterogeneous regions in polarimetric SAR images. The Wishart distribution is widely used to model the fully developed speckle. For areas with less developed speckle, we use a mixture of Wishart distributions to fit the statistics. In this section, we consider two strategies to form the mixture models. One is to model the whole image by using a Wishart mixture model, and the other is to model a class by using a Wishart mixture model. Real data are used to evaluate the performance of mixture strategies. Then, the better mixture strategy will be combined with MRF to develop the spatial-contextual classifier.

Mixture Wishart for a Whole Image
The finite mixture model [12] was proposed to fully consider the polarimetric information and the spatial context. Compared to the traditional Wishart distribution, a mixture of Wishart models can better model heterogeneous regions [12-15,27].
Each component of the mixture model corresponds to a specific cluster, so the number of components equals the number of classes, i.e.,
$$p(C) = \sum_{k=1}^{K} \pi_k\, p(C \mid \Sigma_k),$$
where $p(C \mid \Sigma_k)$ is the Wishart density with mean covariance matrix $\Sigma_k$, $k = 1, 2, \ldots, K$, of each cluster, and $\pi_k$ is the weight of each component, which should satisfy $\sum_{k=1}^{K} \pi_k = 1$ and $\pi_k \ge 0$. The expectation-maximization (EM) algorithm is adopted to update the variables $\Sigma_k$ and $\pi_k$. According to the EM algorithm, the parameters can be estimated by maximizing the log-likelihood function
$$\ln \mathcal{L} = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} \pi_k\, p(C_n \mid \Sigma_k),$$
where $N$ is the number of pixels of the whole image.

Mixture Wishart for a Class
The method described above models the whole image with a mixture model, with each mixture component corresponding to a cluster. Therefore, each class is essentially described by a single component distribution. Gao W [14] proposed another way to form the mixture distribution, which uses a set of components to represent the distribution density of a single class. For class $k$, the Wishart mixture distribution is generated by $M$ Wishart components,
$$p(C \mid k) = \sum_{m=1}^{M} \pi_m^k\, p(C \mid \Sigma_m^k),$$
where $\Sigma_m^k$, $m = 1, 2, \ldots, M$, denotes the centroid of each Wishart component for class $k$. Each component is associated with a weight $\pi_m^k$, which should satisfy $\sum_{m=1}^{M} \pi_m^k = 1$ and $\pi_m^k \ge 0$. The EM algorithm is employed to determine the mixture model parameters for the mixture Wishart for a class (MWC) strategy. Classification is based on the maximum likelihood criterion.
The number of mixture components $M$ is difficult to determine. Theoretically, the overall classification accuracy improves as the number of mixture components increases, but so does the computational cost. In this study, as a trade-off between computational burden and classification accuracy, $M$ was set to six. For each class, the centroids are first initialized by randomly selecting covariance matrices from the training samples, and the weights are initialized as $\pi_m^k = 1/M$. Then, these parameters are iteratively updated by the EM algorithm. After that, the distribution of each class can be expressed as the weighted sum of the Wishart components. The classifier labels a test sample with the class that has the largest likelihood value.
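The EM fitting of a Wishart mixture to the training samples of one class can be sketched as follows. This is only an illustrative implementation under stated assumptions: the function names, the fixed iteration count, and the guard against empty components are choices made here, not details from the paper. The M-step uses the standard closed-form maximizer for Wishart components, namely the responsibility-weighted mean of the sample covariance matrices.

```python
import numpy as np


def wishart_loglik(C, Sigma, L):
    """Wishart log-likelihood of each C[n], up to terms independent of Sigma."""
    _, logdet = np.linalg.slogdet(Sigma)
    # Tr(Sigma^{-1} C_n) for every sample n
    tr = np.einsum("ij,nji->n", np.linalg.inv(Sigma), C).real
    return -L * (logdet + tr)


def em_wishart_mixture(C, M, L, n_iter=50, seed=0):
    """Fit M Wishart components to samples C of shape (N, q, q)."""
    rng = np.random.default_rng(seed)
    N = C.shape[0]
    # Initialization: random training samples as centroids, equal weights 1/M
    Sigma = C[rng.choice(N, size=M, replace=False)].copy()
    weights = np.full(M, 1.0 / M)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability
        log_r = np.stack([np.log(weights[m]) + wishart_loglik(C, Sigma[m], L)
                          for m in range(M)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weights and responsibility-weighted mean covariance matrices
        Nm = resp.sum(axis=0) + 1e-12
        weights = Nm / N
        new_Sigma = np.einsum("nm,nij->mij", resp, C) / Nm[:, None, None]
        # Keep the previous centroid if a component has become (almost) empty
        Sigma = np.where((Nm > 1e-8)[:, None, None], new_Sigma, Sigma)
    return weights, Sigma
```

Under the MWC strategy, such a fit would be run once per class on that class's training samples, after which a test sample is assigned to the class whose fitted mixture gives the largest likelihood.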

Comparison of the Two Mixture Strategies
To determine which model better represents multi-look fully polarimetric SAR data, RADARSAT-2 data acquired over the San Francisco Bay area, USA, were used to evaluate the performance of the two mixture strategies. The Pauli color-coded image is shown in Figure 1a. The PolSAR image was filtered using the refined Lee filter with a 9 × 9 window.
The classification steps for the mixture Wishart for a whole image (MWW) algorithm are as follows:
1. Initialize the parameter Σ k by randomly selecting one covariance matrix from each class of the image. π k is initialized as 1/K.
2. Compute posterior probabilities using Σ k and π k , then update the class labels based on the maximum a posteriori decision rule.
4. Check if the classification result has converged. If not, go back to Step 2. Otherwise, the iteration ends.
The classification steps of the MWC algorithm are different from those of MWW. In the MWW algorithm, it is the EM algorithm that performs the classification. MWC only uses the EM algorithm to estimate the parameters for each class, and the classification is based on the ML criterion. The classification steps are as follows:
1. Randomly select M covariance matrices from the training samples of each class as the initialization of Σ k m . π k m is initialized as 1/M.
2. Update the parameters using (4) and (5). Note that in MWC, the parameter N denotes the number of training samples of each class.
3. Construct the mixture model for each class.
4. Classify the image based on the ML criterion.
Figure 1b,c shows that, by visual inspection, the classification map of MWC was better than that of MWW, especially for water and urban areas. On the right side of the image, the MWW model was strongly affected by the strong backscatter of the offshore buildings, resulting in misclassification. Following the existing literature, we divided urban areas into three classes: low density urban, high density urban, and rotated urban. The MWC model differentiated among the three types of urban areas better than the MWW model, and its classification result for the rotated urban area was relatively complete. The experimental results showed that using a mixture distribution to model a single class was more precise and could improve the land cover classification accuracy.

MRF
The Markov random field theory provides a convenient and consistent way of modeling context-dependent entities such as image pixels and correlated features.
Let $S = \{s_{xy} \mid 1 \le x \le m,\ 1 \le y \le n\}$ denote an image on a 2D lattice, where $s_{xy}$ is the value for site $(x, y)$ and $m$ and $n$ are the numbers of rows and columns of the image. We assume that $X = \{x_i, i \in S\}$ is a random field defined on $S$. We call $X$ an MRF with respect to a neighborhood system $\eta_i$ if and only if the following two conditions are satisfied:
$$P(x) > 0 \ \ \forall x, \qquad P(x_i \mid x_{S \setminus i}) = P(x_i \mid x_{\eta_i}),$$
where $S \setminus i$ is the set containing all sites in $S$ except $i$. The Markovianity implies that site $i$ in $S$ depends only on its neighborhood system. Hammersley and Clifford [28] proved the equivalence between Markov random fields and Gibbs distributions, which provides a mathematically tractable means of specifying the joint probability of an MRF. A Gibbs distribution takes the form
$$P(x) = Z^{-1} \exp\{-U(x)/T\},$$
where $Z$ is a normalizing constant and $T$ is the shape parameter. In practice, $T$ is usually taken as a constant. Let $T = 1$. Then, $U(x) = \sum_{c \in \xi} V_c(x)$ is an energy function, $c$ is a clique, $\xi$ denotes all the cliques of $\eta$, and $V_c(x)$ is a potential function. The energy function is defined as
$$U(x_i) = -\beta \sum_{j \in \eta_i} \delta(x_i - x_j), \quad (10)$$
where $\beta > 0$ is a spatial smoothness parameter that encourages neighboring pixels to have the same region label. According to [29], we set the value of $\beta$ to 1.4 in all our experiments. $\delta(\cdot)$ is the Delta function, which equals 1 when its argument is zero and 0 otherwise.
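To make the role of the smoothness parameter concrete, the sketch below evaluates a local prior energy for one site, assuming the common Potts form in which every neighbor that shares the candidate label lowers the energy by β. The function name `potts_energy` and the default 4-connected neighborhood are illustrative assumptions, not details from the paper.

```python
import numpy as np


def potts_energy(labels, i, j, candidate, beta=1.4,
                 offsets=((-1, 0), (1, 0), (0, -1), (0, 1))):
    """Local prior energy of assigning `candidate` to site (i, j).

    Each in-image neighbor that carries the same label contributes -beta,
    so labels that agree with the neighborhood have lower energy.
    """
    rows, cols = labels.shape
    same = 0
    for di, dj in offsets:
        r, c = i + di, j + dj
        if 0 <= r < rows and 0 <= c < cols and labels[r, c] == candidate:
            same += 1
    return -beta * same
```

With β = 1.4, a candidate label shared by three of the four neighbors has energy -4.2, whereas a label shared by only one neighbor has energy -1.4, so the former is favored by the prior.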

Adaptive Neighborhood System
MRF takes advantage of the spatial information, so that the effect of noise on the classification can be alleviated. However, a fixed neighborhood can blur structural details. To solve this problem, the adaptive neighborhood [30] was proposed to preserve the details of the image. The neighborhood system has five candidates, i.e., $\eta = \{\eta_1, \ldots, \eta_5\}$, as shown in Figure 2. Each of the five candidate shapes is related to a different terrain situation. The most suitable candidate is selected by the following criterion:
$$\eta = \arg\min_{i \in \{1, 2, \ldots, 5\}} \mathrm{std}(\mathrm{span}(\eta_i)), \quad (11)$$
where $\mathrm{span}(\cdot)$ denotes the span values of the pixels in $\eta_i$ and $\mathrm{std}(\cdot)$ denotes the standard deviation. In this way, the MRF model fits the real terrain backscatter better.
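The selection rule in (11) can be sketched as below. The five candidate shapes used here (a square and four directional lines) are hypothetical placeholders, since the actual shapes are defined in Figure 2; only the minimum-standard-deviation criterion is taken from the text.

```python
import numpy as np

# Hypothetical candidate neighborhoods, given as (row, col) offsets.
# The shapes actually used in the paper are those of Figure 2.
CANDIDATES = {
    "square": [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)],
    "horizontal": [(0, -2), (0, -1), (0, 1), (0, 2)],
    "vertical": [(-2, 0), (-1, 0), (1, 0), (2, 0)],
    "diagonal": [(-2, -2), (-1, -1), (1, 1), (2, 2)],
    "antidiagonal": [(-2, 2), (-1, 1), (1, -1), (2, -2)],
}


def select_neighborhood(span, i, j, candidates=CANDIDATES):
    """Return the name of the candidate whose span values vary the least."""
    rows, cols = span.shape
    best_name, best_std = None, np.inf
    for name, offsets in candidates.items():
        vals = [span[i + di, j + dj] for di, dj in offsets
                if 0 <= i + di < rows and 0 <= j + dj < cols]
        if len(vals) < 2:
            continue  # not enough in-image pixels to measure variation
        s = float(np.std(vals))
        if s < best_std:
            best_name, best_std = name, s
    return best_name
```

On an image whose span changes only from column to column, the vertical candidate is selected at an interior pixel, because it is the only one that stays within a single homogeneous column.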

Edge Penalty
Though the adaptive MRF can preserve the structure details of the image, it cannot identify accurate edge locations. For PolSAR images with weak edges, MRF-based methods may cause misclassification [31,32]. In order to improve the classification robustness, we introduce an edge penalty term into the MRF model.
The CFAR edge detector [33], proposed by Schou et al. in 2003, is used to calculate the edges. Edge detection is performed pixel by pixel. For each pixel, a set of filters with different orientations is applied. Each filter estimates the mean covariance matrices on the two sides of the filter window around the central pixel and tests the equality of these two matrices using the Wishart likelihood-ratio statistic Q, where Z x and Z y are the mean covariance matrices on each side of the central pixel and L x and L y are the numbers of looks. Let Q min be the minimum Q over all filters. Then, −2ρ log Q min is defined as the strength of the edge, with the parameter ρ defined as in [33]. If −2ρ log Q min is larger than a threshold, an edge is detected. After extracting the edge information, the edge penalty function g(e i , e j ) [18] is constructed, where e i is the edge strength of site i and edge_c is a constant used to balance the spatial-contextual information and the edge intensity. Adding this edge penalty term to the MRF energy function in (10) gives the edge-aware energy functional used in the classification schemes that follow.
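The edge strength for one filter orientation can be sketched as follows. The expressions for Q and ρ used here are the standard forms of the likelihood-ratio test for the equality of two complex Wishart matrices, written in terms of mean covariance matrices; this is an assumption about the exact form used in the paper, and the function name `edge_strength` is hypothetical.

```python
import numpy as np


def edge_strength(Zx, Zy, Lx, Ly):
    """Return -2 * rho * ln(Q) for the two sides of one filter window.

    Zx, Zy: mean covariance matrices (q x q) on each side.
    Lx, Ly: numbers of looks on each side.
    Assumed form: Q = |Zx|^Lx |Zy|^Ly / |Zp|^(Lx+Ly), with the pooled mean
    Zp = (Lx*Zx + Ly*Zy) / (Lx + Ly).
    """
    q = Zx.shape[0]
    _, ld_x = np.linalg.slogdet(Zx)
    _, ld_y = np.linalg.slogdet(Zy)
    pooled = (Lx * Zx + Ly * Zy) / (Lx + Ly)
    _, ld_p = np.linalg.slogdet(pooled)
    log_q = Lx * ld_x + Ly * ld_y - (Lx + Ly) * ld_p
    # Correction factor of the standard test (assumed form)
    rho = 1 - (2 * q * q - 1) / (6 * q) * (1 / Lx + 1 / Ly - 1 / (Lx + Ly))
    return -2 * rho * log_q
```

In a full detector, this value would be computed for every filter orientation and the largest strength (equivalently, the smallest Q) kept for the central pixel. Identical matrices on both sides give a strength of zero, and the strength grows as the two sides differ more.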

Classification Schemes
Using the above MRF energy functional, in this section, we introduce two classification schemes, which are the pixel-based classification and the region-based classification.

Pixel-Based Classification
For image classification, the task is to estimate the class labels of the pixels in an image. Let $Y = \{y_i, i \in S\}$ be the observed image and $X = \{x_i, i \in S\}$ be the class labels of $Y$. The maximum a posteriori (MAP)-MRF framework is used for classification [34]. According to the Bayesian formula, we have
$$P(x_i \mid y_i) = \frac{p(y_i \mid x_i)\,P(x_i)}{P(y_i)}.$$
$P(y_i)$ is independent of $x_i$. Thus, when both the prior distribution and the likelihood function of a pattern are known, $x_i$ is determined by maximizing the posterior,
$$\hat{x}_i = \arg\max_{x_i \in \{1, \ldots, K\}} p(y_i \mid x_i)\,P(x_i), \quad (17)$$
where $K$ is the number of classes and $P(x_i \mid y_i)$ and $P(x_i)$ are the a posteriori and a priori probabilities of the class label, respectively. $p(y_i \mid x_i)$ is the conditional probability of the observed data, which can be obtained using the statistical models in Section 2. Combining (1) and (2) with (17) gives the Wishart-MRF (WMRF) model and the K-Wishart-MRF (KMRF) model, respectively. For the MWC model, the MRF model is embedded into each component of the mixture distribution to construct the mixture Wishart-MRF (MWMRF) model. The pixel-based classification is implemented via the MAP criterion.
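A MAP-MRF labeling can be approximated with an iterated conditional modes (ICM) loop, sketched below. This is a simplified illustration under stated assumptions: it uses a fixed 4-connected neighborhood and a plain Potts prior, and it omits the adaptive neighborhood and the edge penalty described above. The input `log_lik` is assumed to hold precomputed per-pixel log-likelihoods from any of the statistical models.

```python
import numpy as np


def icm(log_lik, labels, beta=1.4, n_iter=10):
    """Iterated conditional modes for MAP-MRF labeling.

    log_lik: array (rows, cols, K) with ln p(y_i | x_i = k).
    labels:  initial integer label map of shape (rows, cols).
    """
    rows, cols, K = log_lik.shape
    labels = labels.copy()
    for _ in range(n_iter):
        changed = 0
        for i in range(rows):
            for j in range(cols):
                # Count how many in-image neighbors carry each label
                votes = np.zeros(K)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        votes[labels[r, c]] += 1
                # Maximize log-likelihood plus the Potts log-prior
                best = int(np.argmax(log_lik[i, j] + beta * votes))
                if best != labels[i, j]:
                    labels[i, j] = best
                    changed += 1
        if changed == 0:
            break  # converged: no label changed in a full sweep
    return labels
```

A natural starting point is the maximum likelihood label map, `log_lik.argmax(axis=2)`. An isolated pixel whose likelihood weakly favors a different class than all of its neighbors is relabeled when β is large enough, which is the noise-suppressing behavior the spatial prior is meant to provide.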

Region-Based Classification
In [20], a region-based method was proposed for MRF-based image segmentation. First, an image is over-segmented into a large number of rectangular regions. Then, the WMRF model is used to adjust the boundaries. Finally, a Wishart-based ML classifier is applied to these regions to get the final classification map. However, in [20], during the boundary adjustment procedure, the posterior of each pixel was compared with the pixels of the whole image, which is time consuming. Thus, we modified this method to reduce the cost.
Following the work in [20], the image is first segmented into a large number of non-overlapping r × r rectangular regions. Before building the MRF model, the adaptive MRF is employed to select the best neighborhood. Since a region is taken as the image unit in the region-based method, we only considered the WMRF and KMRF models. Suppose that $\Sigma_a = \frac{1}{N_a}\sum_{n=1}^{N_a} C_{a,n}$ is the average covariance matrix of region $a$, $N_a$ is the number of pixels in region $a$, and $C_{a,n}$ denotes the covariance matrix of the $n$th pixel in region $a$. The region-based WMRF and KMRF models in (21) are built on these region statistics. Then, the MAP criterion and the iterative conditional mode (ICM) [20,35,36] algorithm are used to adjust the boundaries, which is called soft segmentation [20]. Note that in this step, at each iteration, the a posteriori probability of pixel $i$ is only computed and compared within the neighborhood $\eta_a$ to which the pixel belongs. The neighborhood system of a region is different from that of a pixel, as shown in Figure 3. At the end of each iteration, we check the segmentation results and assign any area that is too small to the adjacent area. The purpose of this procedure is to reduce the computational cost and avoid isolated pixels. According to (17) and (21), the region label of pixel $i$ is estimated by (22). Note that (22) is used to adjust the boundaries, not for classification. After the soft segmentation, each region is taken as a basic unit of the image. Then, the ML criterion is used to get the final classification result.
The procedure of the region-based classification is as follows:
1. Divide the m × n image into I = m/r × n/r regions, which do not overlap with each other. Let X = {x a , a = 1, . . . , I} denote the region labels.
2. Calculate the mean covariance matrix Σ a for each region.
3. Use (22) to update the region boundaries.
4. Check the segmentation result. If the region area is smaller than a threshold p, reassign it to the adjacent region. The parameter p decides the smallest area of a region.
5. Check if the segmentation result has converged. If not, go back to Step 2. Otherwise, end the iteration, and the superpixels are obtained.
6. Calculate the mean covariance matrix Σ a of each region. Apply the ML classifier to get the final classification result.
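Steps 1 and 2 of the procedure above can be sketched as follows: the image is partitioned into r × r blocks, and the mean covariance matrix of every region is computed from its pixels. The function names and the array layout (one q × q matrix per pixel) are assumptions made for this illustration; the boundary update in Step 3 is not covered.

```python
import numpy as np


def initial_regions(m, n, r):
    """Label map of shape (m, n) that partitions the image into r x r blocks."""
    n_col_blocks = -(-n // r)  # ceiling division, so edge blocks are kept
    row_block = np.arange(m) // r
    col_block = np.arange(n) // r
    return row_block[:, None] * n_col_blocks + col_block[None, :]


def region_means(C, region_ids):
    """Mean covariance matrix of every region.

    C: array (m, n, q, q) of per-pixel covariance matrices.
    region_ids: integer label map (m, n) with labels 0 .. I-1.
    """
    ids = region_ids.ravel()
    flat = C.reshape(-1, *C.shape[2:])
    n_regions = int(ids.max()) + 1
    sums = np.zeros((n_regions,) + C.shape[2:], dtype=C.dtype)
    np.add.at(sums, ids, flat)  # accumulate the matrices of each region
    counts = np.bincount(ids, minlength=n_regions)
    # Guard against labels that no longer own any pixel after merging
    return sums / np.maximum(counts, 1)[:, None, None]
```

Because `region_means` only relies on the label map, it can be reused unchanged after the boundaries have been adjusted, which is what Steps 2 and 6 require.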

Test Data
Polarimetric SAR data collected over two agricultural areas and a city were used for the experiments. The first dataset consisted of L-band data from NASA/JPL AIRSAR collected over Flevoland, the Netherlands, in 1989. The Pauli color-coded image and the ground truth map, which have 400 × 280 pixels, are shown in Figure 4a-c. There was a total of 8 crop classes. The second dataset was from RADARSAT-2, acquired over Wallerfing, Germany, in fine beam mode with a resolution of about 8 m. It covers an agricultural area and was collected on May 28, 2014, with the incidence angle ranging from 40.2° to 41.6° on the descending pass. The size of the image was 920 × 500 pixels. This region had five classes according to the ground measurements. The Pauli color-coded image and the ground truth map are shown in Figure 4d-f. The third dataset was from RADARSAT-2, acquired over Fujian Province, China. It covers Fuzhou Langqi Island (center latitude: 26°03′ N, center longitude: 119°35′ E) and was collected on November 13, 2013. The resolution was around 8 m. The size of the image was 520 × 500 pixels. This region had 3 classes: urban area, water, and forest. Because we did not have a ground truth map, we only provide the Pauli color-coded image in Figure 4g. All PolSAR images were filtered with the refined Lee filter with a 9 × 9 window before classification. The experiments were run on a computer with a Core i7 CPU clocked at 1.99 GHz, a 1 TB hard drive, and 8 GB of memory.
The following methods were carried out in the experiments: the combination of the Wishart model and MRF (WMRF) for pixel-based classification, the combination of the K-Wishart model and MRF (KMRF) for pixel-based classification, the combination of the MWC model and MRF (MWMRF) for pixel-based classification, the combination of the MWC model and MRF without the edge penalty term (MWMRF/e) for pixel-based classification, the combination of the Wishart model and MRF (WMRF) for region-based classification, and the combination of the K-Wishart model and MRF (KMRF) for region-based classification.

Pixel-Based Classification
Classification results obtained by the pixel-based WMRF, KMRF, and MWMRF methods are shown in Figures 5 and 6, and the accuracy assessments are given in Tables 1 and 2, for the Flevoland and Wallerfing datasets, respectively. According to Figures 5 and 6, the classification results of MWMRF were the best, and MWMRF had the highest OA and Kappa coefficients in the quantitative comparison. Compared with the other two models, MWMRF fit the real PolSAR statistics better. It was also capable of capturing the spatial-contextual information, which greatly reduced the effect of noise on the classification. As shown in Figure 5c,d and Figure 6c,d, there were more misclassified pixels around the edges in the MWMRF/e results than in the MWMRF results. In the areas marked by black squares in Figures 5c and 6c, the classification map obtained by MWMRF was smoother, and the edge locations were more accurate. This indicates that the edge penalty term had a positive effect on the classification. Furthermore, Tables 1 and 2 show that the classification results of MWMRF for both the Flevoland and Wallerfing datasets were better than the others for all classes. The OA and Kappa coefficients were 96.44% and 0.9580, respectively, for the Flevoland data, and 84.40% and 0.7854, respectively, for the Wallerfing data. The Fujian dataset was analyzed by visual inspection only, since no ground truth was available. According to Figure 7, the classification result of the MWMRF model was still the best, especially for the urban area; it had better regional integrity and less noise. Comparing Figure 7c with Figure 7d, the coastline and the edges between forest and urban areas were more accurate in Figure 7c. This analysis suggests that the MWMRF model was also effective for heterogeneous regions with complex shapes.
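For reference, the overall accuracy (OA) and the Kappa coefficient reported in the tables can be computed from a confusion matrix as sketched below, using the standard definitions of both measures. The function name and the flat integer label arrays are assumptions made for this illustration.

```python
import numpy as np


def accuracy_and_kappa(truth, pred, n_classes):
    """Overall accuracy and Cohen's Kappa from two integer label arrays."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (truth, pred), 1)  # rows: ground truth, columns: prediction
    total = cm.sum()
    oa = np.trace(cm) / total
    # Agreement expected by chance, from the row and column marginals
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, kappa
```

Only pixels that have a ground truth label should be passed in; unlabeled pixels would need to be masked out beforehand.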
The computational cost was 3 min for WMRF, 17.5 min for KMRF, and 20.8 min for MWMRF. This shows that MWMRF was a robust and efficient pixel-based classification algorithm for PolSAR images.

Region-Based Classification
The classification results obtained by the region-based methods are shown in Figures 8-10. We set r = 10 and p = 25 in this study. The image was first divided into 10 × 10 rectangles, and then the MRF models were used to adjust the boundaries. We required the area of a region to be no less than 25 pixels. The segmentation results are shown in Figure 8a,b, Figure 9a,b, and Figure 10a,b. The final classification maps, obtained from the segmentation results using the Wishart-based ML algorithm, are shown in Figure 8c,d, Figure 9c,d, and Figure 10c,d. Tables 3 and 4 give the OA and the Kappa coefficients for the Flevoland and Wallerfing classification results.
As shown in Figure 8c,d and Figure 9c,d, the edges of the KMRF classification map were more accurate, especially for the Flevoland dataset. In the Fujian dataset, regions were irregularly shaped, and the edges between different areas were not very clear. By visual inspection, the classification results of WMRF and KMRF did not differ much. However, a close comparison of the segmentation results in Figure 8a,b, Figure 9a,b, and Figure 10a,b shows that the segmentation based on the WMRF model was more homogeneous and the region edges were more complete. In comparison, the segmentation results of the KMRF model were not smooth enough and were sensitive to image textures. Since the classification result of the region-based algorithm largely depended on the segmentation result, the classification maps of the WMRF model were better. The evaluations in Tables 3 and 4 support this. The OA and Kappa coefficients of WMRF were 93.46% and 0.9231, respectively, for the Flevoland data, and 83.62% and 0.7763, respectively, for the Wallerfing data. Although the classification accuracies and Kappa coefficients of the two models were close, those of the WMRF model were consistently higher. Moreover, the K-Wishart distribution was more complex than the Wishart distribution. The computational cost was 18 min for WMRF and 5.3 h for KMRF.
The segmentation algorithm based on KMRF consumed more time.

Discussions
We compared the pixel-based and region-based classification algorithms based on the WMRF and KMRF models. Visual inspection showed that the region-based algorithm was better at classifying the Flevoland data. The misclassifications caused by image textures and speckle noise were noticeably reduced. We also performed the McNemar test on the classification results of the region-based and pixel-based algorithms. The resulting p-values were 8.85e-81 for pixel-based versus region-based WMRF and 7.55e-72 for pixel-based versus region-based KMRF, against a significance level of 0.05. Thus, for both the WMRF and KMRF methods, the region-based and pixel-based classification results had significantly different proportions of errors. For the Wallerfing data, the results were reversed. Tables 2 and 4 show that the OA and Kappa coefficients of the pixel-based method were higher. This was because the fields in the Wallerfing dataset were small and irregularly shaped, and the edges were weak, which made image segmentation difficult. The region-based classification method took a region as the basic image unit. Therefore, inaccurate segmentation could result in multiple classes within a region, which would cause misclassification. The Fujian dataset had heterogeneous regions with irregular shapes. The weak edges between regions also increased the difficulty of image segmentation. However, the individual regions were relatively large, which made segmentation easier than for the Wallerfing data. The classification maps of the region-based algorithm also showed a higher region integrity than those of the pixel-based algorithms. In terms of time consumption, the region-based algorithms took longer than the pixel-based algorithms. The experiments were run on a laptop, which contributed to the long run times. In practical applications, more powerful hardware may alleviate the time consumption problem.
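The McNemar test used to compare two classifiers can be sketched as follows. This version uses the chi-square statistic without continuity correction and one degree of freedom, which is an assumption, since the exact variant used in the experiments is not stated. The function name is hypothetical.

```python
import numpy as np
from math import erfc, sqrt


def mcnemar(truth, pred_a, pred_b):
    """McNemar chi-square statistic and p-value for two label maps."""
    a_ok = pred_a == truth
    b_ok = pred_b == truth
    n01 = int(np.sum(a_ok & ~b_ok))  # A correct, B wrong
    n10 = int(np.sum(~a_ok & b_ok))  # A wrong, B correct
    if n01 + n10 == 0:
        return 0.0, 1.0  # the classifiers never disagree in correctness
    chi2 = (n01 - n10) ** 2 / (n01 + n10)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

A p-value below the chosen significance level (0.05 in the text) indicates that the two classifiers have different error proportions.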
Therefore, the region-based algorithm was more suitable for the Fujian dataset. In conclusion, the pixel-based method was more suitable for the Wallerfing dataset, and the region-based method was more suitable for the Flevoland and Fujian datasets.

Conclusions
This paper studied MRF-based algorithms for PolSAR image classification. Three statistical distributions, namely the Wishart, K-Wishart, and mixture Wishart models, were combined with MRF to implement pixel-based and region-based classification. A new mixture strategy was proposed for the MRF model, and its performance was evaluated on real PolSAR images. It was found that using a Wishart mixture distribution to model a class performed better than using it to model a whole image. The adaptive neighborhood and an edge penalty term were also employed in the MRF energy functional to locate edges more accurately. The Wishart and K-Wishart distributions served as the likelihood term in the MRF framework, while for the mixture model, MRF was embedded into each component of the mixture to construct the MWMRF. Land cover classification is always a challenging task. In the experiments, real PolSAR images acquired over an island and over agricultural fields with irregular shapes were used for demonstration. The results showed that the MWMRF model gave the best classification results among the pixel-based methods. For the region-based methods, the OA and Kappa coefficients of the WMRF model were higher than those of KMRF. Moreover, WMRF was more time-efficient.
The pixel-based methods were better at preserving image details such as boundary locations. However, noise (isolated class labels or small groups of misclassified pixels) remained in the classification maps. The region-based methods took region blocks as the classification unit, so their classification maps had less noise. However, the final classification results were largely dependent on the segmentation results. Experiments showed that the pixel-based MWMRF model was suitable for classifying small and irregularly shaped fields, and the region-based WMRF model could be used for the classification of evenly distributed farmlands and urban areas.