A Novel Classification Optimization Approach Integrating Class Adaptive MRF and Fuzzy Local Information for High Spatial Resolution Multispectral Imagery

Abstract: This paper develops a novel classification optimization approach integrating class adaptive Markov Random Field (MRF) and fuzzy local information (CAMRF-FLI) for high spatial resolution multispectral imagery (HSRMI). First, the raw classification results, including the initial fuzzy memberships and class label of every pixel, are obtained by a pixel-wise classification method for a given image. Second, a class adaptive MRF-based data energy function is developed to integrate class spatial dependency information. Third, a novel spatial energy function integrating fuzzy local information is constructed. Finally, based on the sum of the data and spatial energies, the raw classification map is regularized by a global minimization of the energy function using iterated conditional modes (ICM). The effectiveness of CAMRF-FLI is evaluated on two data sets. The results indicate that it refines the classification map in homogeneous areas, reduces most of the edge blurring artifacts, and improves the classification accuracy compared with several conventional approaches.


Introduction
Extracting land cover information from high spatial resolution multispectral imagery (HSRMI), such as WorldView, QuickBird, and IKONOS images, is an important remote sensing application. HSRMI provides more detailed ground cover information. At the same time, increased spatial resolution often gives rise to spectral similarity, which in turn reduces the spectral statistical separability of different classes [1]. Conventional spectral pixel-wise classification methods, such as support vector machines (SVMs) and the maximum likelihood classifier (MLC), are therefore inadequate for HSRMI classification. The abundant spatial information contained in HSRMI, however, can help achieve accurate classification. Consequently, many spectral-spatial classification methods have been developed and have proved successful in improving classification accuracy [2].
Generally, two main strategies are used to incorporate spatial information into classification. One is preprocessing, in which spatial information is extracted and combined into a feature vector for every pixel, which is then passed to a pixel-wise classification method [3][4][5]. The other is classification postprocessing (CPP), in which spatial dependences are considered in the decision rule [6][7][8][9]. In this paper, we focus on CPP algorithms. Several CPP methods have been investigated and confirmed to be effective in enhancing classification results [10]. Among them, the majority voting (MV) filter is the simplest and most widely used CPP method; it assigns the central pixel to the majority class within a sliding window. Unfortunately, its high accuracy comes at the cost of oversmoothing, which loses image details [11]. The Gaussian filter (GS) was proposed to improve MV; in it, the class of the central pixel is determined by class fuzzy memberships weighted according to the Gaussian distance in a local window [12]. However, this method neglects the central pixel's own features and may therefore produce oversmoothed results. The object-based voting (OBV) method refines the raw classification result by conducting MV within boundaries derived from image segmentation [13]. However, the effectiveness of OBV depends on the performance of the segmentation. MRF is a classic CPP method in which the raw classification map is smoothed through the dependence among neighboring pixels [14,15]. The traditional MRF, however, often leads to oversmoothed solutions because of the equally weighted neighbor influences and the fixed coefficient β. To overcome this defect, several enhanced MRF models have been proposed and proved effective in preserving image details to a certain extent [16][17][18]. These works developed more complex spatial prior models to suppress the smoothing effect, and the models can effectively preserve boundaries. Unfortunately, these MRF models involving local discontinuities still produce unexpected and isolated class labels around object boundaries. In addition, selecting the penalty coefficient β in these methods is difficult and time-consuming. From the above discussion, most CPP methods obtain correct classification results in homogeneous regions, whereas many errors occur on boundaries. A more effective method is needed to deal with such boundary pixels.
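As an illustration of the MV filter described above, the following minimal Python sketch relabels each pixel with the majority class in a sliding window. The function name `majority_vote_filter`, the border handling (windows are clipped at the image edge), and the tie-breaking rule are assumptions made for this example and are not taken from the cited works.

```python
from collections import Counter


def majority_vote_filter(labels, window=3):
    """Relabel every pixel with the majority class inside a square window.

    labels: 2-D list of integer class labels.
    window: odd window width (3 means a 3 x 3 window).
    """
    half = window // 2
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # Count labels in the window, clipped at the image border (assumption).
            votes = Counter(
                labels[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            top = max(votes.values())
            winners = sorted(lab for lab, n in votes.items() if n == top)
            # Tie-breaking (assumption): keep the current label if it is tied
            # for the lead, otherwise take the smallest winning label.
            out[r][c] = labels[r][c] if labels[r][c] in winners else winners[0]
    return out


# A single isolated label surrounded by another class is removed.
noisy = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
smoothed = majority_vote_filter(noisy, window=3)
```

The same mechanism that removes the isolated pixel here also erodes thin structures and corners as the window grows, which is the oversmoothing discussed above.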
This paper presents a novel classification optimization approach, CAMRF-FLI, which integrates class adaptive MRF and fuzzy local information for HSRMI classification. The aim is to provide a proper trade-off between oversmoothing and spatial regularization, producing homogeneous regions while preserving image details. CAMRF-FLI is validated through two experiments, and the results show that it is effective in improving classification accuracy.

Methodology
In this section, we first briefly introduce the class adaptive MRF for integrating class spatial dependency information. Then, a novel local similarity measure is developed to incorporate fuzzy local information. Finally, the proposed approach is described in detail.

Consider an image whose pixels $x_i$ ($i = 1, \ldots, n$) form a dataset in the $B$-dimensional vector space, where $n$ is the number of pixels and $c$ is the number of classes ($2 \le c < n$) in the image.
The MRF is widely used to incorporate spatial contextual information into image classification through the dependence among neighboring pixels [10]. In this model, the prior probability serves as an operator for directly exploiting the contextual information. According to the Hammersley-Clifford theorem and Gibbs theory [4], the prior probability of pixel $x_i$ is expressed through a smoothing parameter $\beta$, which reflects the interaction between two neighboring pixels, and an energy function $E_k(x_i)$ for pixel $x_i$ belonging to the $k$th class. The energy function is defined over $N_i$, the set of neighbors of pixel $x_i$ ($i \notin N_i$), which is often a second-order neighborhood system (8-neighborhood connectivity) in the traditional MRF. In it, $k(x_i)$ and $k(x_j)$ denote the labels of pixel $x_i$ and its neighboring pixel $x_j$, respectively, and $\delta(k(x_i), k(x_j))$ is the Kronecker delta function.
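To make the roles of $\beta$, $E_k(x_i)$, $N_i$, and the Kronecker delta concrete, the sketch below uses a common Potts-style form: the energy counts the 8-neighbors whose label differs from class $k$, and the prior is the normalized value of $\exp(-\beta E_k(x_i))$. This specific form is an assumption chosen for illustration, since the exact expressions are not reproduced in this text; the function names are likewise hypothetical.

```python
import math

# Offsets of the second-order (8-connected) neighborhood; the pixel itself is excluded.
OFFSETS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def kronecker_delta(a, b):
    """Return 1 when the two labels are equal, otherwise 0."""
    return 1 if a == b else 0


def potts_energy(labels, row, col, k):
    """Assumed energy: number of in-image 8-neighbors whose label differs from k."""
    rows, cols = len(labels), len(labels[0])
    energy = 0
    for dr, dc in OFFSETS_8:
        r, c = row + dr, col + dc
        if 0 <= r < rows and 0 <= c < cols:
            energy += 1 - kronecker_delta(k, labels[r][c])
    return energy


def gibbs_prior(labels, row, col, num_classes, beta=1.0):
    """Assumed prior: exp(-beta * energy), normalized over all classes."""
    weights = [math.exp(-beta * potts_energy(labels, row, col, k))
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]


labels = [[0, 0, 1],
          [0, 1, 1],
          [0, 0, 1]]
# The center pixel has five neighbors labeled 0 and three labeled 1,
# so class 0 receives the larger prior probability.
prior = gibbs_prior(labels, 1, 1, num_classes=2, beta=1.0)
```

In this form every neighbor contributes equally and $\beta$ is a single global constant, which is exactly the behavior the class adaptive coefficient is meant to relax.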
In the MRF, $\beta$ is a very important coefficient that controls the interaction between two pixels, but its selection is difficult and time-consuming in practice. In the traditional MRF, the neighboring pixels exert equal influence on the central pixel $x_i$, regardless of the local image statistics and of whether a neighboring pixel is an edge pixel. As a result, the traditional MRF performs well in homogeneous areas, while edge pixels are at risk of being oversmoothed. Moreover, $\beta$ is fixed for all neighborhood windows over the whole image, so local image information may be overlooked. To address these problems, we propose a new class adaptive coefficient $\beta^*$ (Equation (4)). In its definition, $N_R$ is the cardinality of the neighborhood $N_i$, and $u_k(x_i)$ and $u_k(x_j)$ denote the fuzzy memberships of pixels $x_i$ and $x_j$ in the $k$th class, respectively. Based on the above discussion, a novel prior probability function $P^*_k(x_i)$ is proposed (Equation (5)). Here, the local window ($N_i$) is square, but windows of other shapes can also be used, and the neighborhood is not limited to the 8-neighborhood; larger windows can also be used.
As seen from Equations (4) and (5), there are no experimental parameters. $\beta^*$ is determined from the differences in fuzzy membership between two pixels, so it is adaptive to classes and anisotropic within the neighborhood. This is consistent with the fact that the image itself is anisotropic, and it therefore preserves more details than a fixed coefficient.

A Novel Local Similarity Measure
In this paper, the local similarity measure comprises a spatial distance factor and an intensity distance factor. For a given pixel, the spatial constraint $w_d$ (Equation (6)) reflects how strongly the influence of each neighbor is damped with its distance from the central pixel. Here $x_i$ is the central pixel of the neighborhood $N_i$, $x_j$ is a neighbor of $x_i$, and $d(x_i, x_j)$ is the Euclidean distance between pixels $x_i$ and $x_j$. As Equation (6) shows, $w_d$ is determined by the distance between the central pixel and its neighbors, so more local context information can be exploited. However, distance alone is not enough to describe the damping extent within the neighborhood. The intensity distance is therefore introduced to take local grey-level statistics into account. The intensity distance between two pixels, $w_u(k)$, is defined in Equation (7). From Equation (7), the class similarity between two pixels is proportional to $w_u(k)$: a high value means a high probability that pixels $x_i$ and $x_j$ belong to the same class, and vice versa.
The similarity measure $w_{ij}(k)$ between pixels $x_i$ and $x_j$ is then defined in Equation (8). No experimental parameters are needed in the similarity measure, and each pixel is affected by its neighboring pixels and by itself simultaneously. The local similarity measure is also adaptive to classes and anisotropic within the neighborhood, which helps retain boundary pixels and image details during spatial regularization.
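The spatial factor depends only on the Euclidean distance $d(x_i, x_j)$ inside the window, so it can be precomputed once per window size. The sketch below builds that distance grid and then applies a damping of the form $1 / (1 + d)$. The damping form is purely an assumption for illustration, because Equation (6) is not reproduced in this text; the function names are hypothetical as well.

```python
import math


def window_distances(size):
    """Euclidean distance from the window center to every cell of a size x size window."""
    half = size // 2
    return [[math.hypot(dr, dc) for dc in range(-half, half + 1)]
            for dr in range(-half, half + 1)]


def spatial_damping(size):
    """ASSUMED damping 1 / (1 + d); not necessarily the paper's Equation (6)."""
    return [[1.0 / (1.0 + d) for d in row] for row in window_distances(size)]


distances = window_distances(3)
weights = spatial_damping(3)
# Edge-adjacent neighbors (d = 1) get a larger weight than diagonal ones (d = sqrt(2)).
# The center cell (d = 0) is kept in the grid only so that indices line up with the window.
```

Whatever the exact damping function, the grid is identical for every pixel, so only the membership-based intensity factor has to be recomputed per pixel and per class.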

Proposed Method
Following the MRF-based image classification algorithm, CAMRF-FLI is proposed by incorporating the prior probability function and the local similarity measure. We regularize the classification map in the MAP-MRF framework, which is based on the interpixel class dependence assumption that neighboring pixels tend to belong to the same class. Following the Bayesian approach, the optimization is formulated as the minimization of the global objective energy of the image by iterative minimization of local energies. Details of the MRF-based classification approach have been introduced previously [10]. The local energy function for a pixel $x_i$ is represented as

$$U(x_i) = U_{data}(x_i) + U_{spatial}(x_i) \qquad (9)$$

where $U_{data}(x_i)$ is the data energy term and $U_{spatial}(x_i)$ is the spatial energy term. In the CAMRF-FLI method, the data energy function (Equation (10)) accounts for class spatial dependency information by incorporating the prior probability function described in Equation (5), and the spatial energy term (Equation (11)) introduces the local similarity measure described in Equation (8). Many algorithms have been proposed to minimize the energy function in Equation (9), such as simulated annealing (SA), maximizer of posterior marginals (MPM), and ICM [4]. For computational simplicity, Equation (9) is solved by the ICM method. In addition, at each ICM iteration the fuzzy memberships of each pixel are updated according to Equation (12). The flowchart of the proposed CAMRF-FLI approach is shown in Figure 1. The implementation includes the following five steps:
Step 1: Initialization. The initial fuzzy membership matrix $U = \{u_k(x_i)\}_{c \times n}$ is obtained by the pixel-wise probabilistic MLC method. Set the maximum number of iterations maxIteration, the loop counter $b = 0$, and the window size $N_i$.
Step 2: Calculate the prior probability $P^*_k(x_i)$ and the local similarity measure $w_{ij}(k)$. $P^*_k(x_i)$ and $w_{ij}(k)$ are calculated according to Equations (5) and (8), respectively.
Step 3: Calculate the data energy term $U_{data}(x_i)$ and the spatial energy term $U_{spatial}(x_i)$. Based on the calculated $P^*_k(x_i)$ and $w_{ij}(k)$, $U_{data}(x_i)$ and $U_{spatial}(x_i)$ are calculated according to Equations (10) and (11), respectively.
Step 4: Solve the objective energy function. The objective energy $U(x_i) = U_{data}(x_i) + U_{spatial}(x_i)$ is minimized by the ICM algorithm.
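The iterative part of the procedure, in which ICM repeatedly assigns each pixel the class with the lowest local energy for at most maxIteration passes, can be sketched as follows. This is a generic ICM loop under stated assumptions, not the authors' implementation: the energy is supplied as a callable, `make_toy_energy` is a hypothetical stand-in (negative log membership plus a count of disagreeing 8-neighbors) rather than Equations (10) and (11), the membership update of Equation (12) is omitted, and the early exit when no label changes is an added convenience.

```python
import math


def icm(initial_labels, num_classes, local_energy, max_iteration=100):
    """Iterated conditional modes: greedy per-pixel minimization of a local energy."""
    labels = [row[:] for row in initial_labels]
    rows, cols = len(labels), len(labels[0])
    for _ in range(max_iteration):
        changed = 0
        for r in range(rows):
            for c in range(cols):
                # Pick the class with the lowest local energy given the current labels.
                best = min(range(num_classes),
                           key=lambda k: local_energy(labels, r, c, k))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed += 1
        if changed == 0:  # assumption: stop early once the labeling is stable
            break
    return labels


def make_toy_energy(memberships, beta=1.0):
    """Hypothetical energy: -log(membership) + beta * (number of disagreeing 8-neighbors)."""
    def local_energy(labels, r, c, k):
        rows, cols = len(labels), len(labels[0])
        data = -math.log(max(memberships[r][c][k], 1e-12))
        spatial = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != k:
                    spatial += 1
        return data + beta * spatial
    return local_energy


# 3 x 3 toy image with two classes; only the center pixel weakly prefers class 1.
memberships = [[[0.9, 0.1] for _ in range(3)] for _ in range(3)]
memberships[1][1] = [0.4, 0.6]
initial = [[max(range(2), key=lambda k: memberships[r][c][k]) for c in range(3)]
           for r in range(3)]
refined = icm(initial, 2, make_toy_energy(memberships), max_iteration=100)
```

Because ICM updates labels in place during the scan, the result can depend on the visiting order, and the method converges to a local rather than a global minimum of the energy.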

Experimental Results and Discussion
To assess the effectiveness of the proposed algorithm, the performance of CAMRF-FLI is analyzed and compared with that of the MLC, MV, MRF, OBV, and GS methods through two experiments.
In both experiments, the parameter β for MRF is set to 1, and the number of iterations is set to 100 for MRF and CAMRF-FLI. For the OBV method, eCognition 8.9 software is used to obtain the segmentation map, with the parameters scale = 10 and wcolor = 0.3. The window sizes for MV, GS, and CAMRF-FLI are varied from 3 × 3 to 19 × 19. The study areas cover grass, water, building, tree, bare soil, road, and shadow. The reference image is obtained by visual interpretation of the well-rectified image; 20% of the reference data is selected as training samples by stratified random sampling, and the remainder is used as testing samples. All algorithms were implemented in Matlab 2013b. The Producer's Accuracy, Overall Accuracy (OA), and Kappa coefficient (κ), all derived from the confusion matrix, are used to evaluate classification performance. For simplicity, MLC is applied to obtain the posterior probability and initial class label of every pixel in the experiments. Each algorithm is run ten times, and the average classification accuracy and the optimal classification results are reported.
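For reference, the three evaluation measures can be computed from a confusion matrix as in the minimal sketch below. It uses the standard definitions of Producer's Accuracy, OA, and Cohen's kappa; the matrix orientation (rows are reference classes, columns are classified classes) and the function name are assumptions of this example, and the numbers are made up rather than taken from the experiments.

```python
def accuracy_metrics(cm):
    """Return (producer_accuracies, overall_accuracy, kappa) from a confusion matrix.

    cm[i][j] is the number of test pixels whose reference class is i and whose
    classified class is j (orientation assumed for this sketch).
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    diagonal = sum(cm[i][i] for i in range(k))
    row_totals = [sum(cm[i]) for i in range(k)]                       # reference totals
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # classified totals

    overall = diagonal / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)
    # Producer's accuracy: correctly classified pixels / reference pixels of that class.
    producers = [cm[i][i] / row_totals[i] if row_totals[i] else float("nan")
                 for i in range(k)]
    return producers, overall, kappa


# Illustrative two-class matrix (made-up counts).
cm = [[40, 10],
      [5, 45]]
producers, overall, kappa = accuracy_metrics(cm)
```

Kappa discounts the agreement that the marginal class proportions would produce by chance, which is why it is reported alongside OA.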

Experiment on Data Set 1
In this experiment, a portion (512 × 512 pixels) of a 0.61-m resolution QuickBird image with three multispectral bands (red, green, and blue), acquired in August 2005 (Figure 2a), is used. The study area is located in an urban area of Xuzhou City, China. Figure 2b shows the reference image. Table 1 describes the training and testing samples in Experiment 1. As Figure 2a shows, the distributions of road and building are irregular and complex, and the spectral similarity is high between road and building, between water and shadow, and between grass and tree. It is therefore difficult to classify the image using spectral information alone.
Figure 3a gives the OA curves obtained with different window sizes for the MV, GS, and CAMRF-FLI algorithms in Experiment 1. The three algorithms have similar OA with small window sizes, while CAMRF-FLI provides a significant advantage over MV and GS in terms of OA as the window size increases. MV and GS obtain their highest OA with a 5 × 5 window, and CAMRF-FLI with a 15 × 15 window. Figure 2c-h shows the classification results of MLC, MV (5 × 5), MRF, OBV, GS (5 × 5), and CAMRF-FLI (15 × 15), respectively. As shown in Figure 2c, because many mixed or spectrally similar pixels exist and no spatial contextual information is incorporated, MLC produces a map with "salt and pepper" noise. Figure 2d-h all show more homogeneous regions than MLC, but GS and CAMRF-FLI are more effective in eliminating isolated pixels. GS removes almost all of the isolated pixels and obtains satisfactory results, but it oversmooths boundaries and produces large patches. In the CAMRF-FLI results, most of the isolated pixels are eliminated and image details are satisfactorily retained. As seen from Table 1, CAMRF-FLI produces the highest OA and most of the best Producer's Accuracies. The OA gains of CAMRF-FLI over MLC, MV, MRF, OBV, and GS are 10.81%, 6.99%, 5.49%, 6.44%, and 4.45%, respectively; the differences in classification accuracy are statistically significant at the 5% level of significance.

Experiment on Data Set 2
In this experiment, a QuickBird image (512 × 512 pixels) with three 0.61-m resolution fused multispectral bands (red, green, and blue) is used. It covers a suburban area of Xuzhou, China, and was acquired in August 2005 (Figure 4a). As shown in Figure 4a, the spectral information of road and building is very similar, as is that of water and shadow and of grass and tree. Roads and buildings are also interleaved, so it is difficult to classify the image with spectral information alone. Figure 4b gives the reference image. Table 2 shows the training and testing samples in Experiment 2.
As shown in Figure 3b, the MV, GS, and CAMRF-FLI algorithms again have similar OA with small window sizes, but CAMRF-FLI is more robust to the window size than MV and GS in terms of OA. MV and GS obtain their highest OA with the 5 × 5 window, and CAMRF-FLI with the 13 × 13 window. Figure 4c-h shows the classification results of MLC, MV (5 × 5), MRF, OBV, GS (5 × 5), and CAMRF-FLI (13 × 13), respectively. In Figure 4c, because of the spectral similarity and the lack of spatial information, MLC produces a map with "salt and pepper" noise, in which road, building, and bare soil, tree and grass, and water and shadow are confused. Figure 4d-h all show more homogeneous regions than Figure 4c because they incorporate spatial information, although MV, MRF, OBV, and GS perform worse than CAMRF-FLI, which produces homogeneous areas while preserving more image details. Table 2 gives the quantitative results. MV, MRF, OBV, GS, and CAMRF-FLI all achieve higher classification accuracies than MLC. Among them, CAMRF-FLI obtains the highest OA and most of the best Producer's Accuracies. In terms of OA, CAMRF-FLI achieves 79.75%, a gain of 12.89%, 9.52%, 7.35%, 8.01%, and 6.09% over MLC, MV, MRF, OBV, and GS, respectively. According to McNemar's test, all differences in classification accuracy are statistically significant at the 5% level of significance.
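McNemar's test compares two classifiers on the same test pixels using only the pixels on which they disagree. The sketch below computes the z statistic $z = (f_{12} - f_{21}) / \sqrt{f_{12} + f_{21}}$, where $f_{12}$ counts pixels that the first classifier labels correctly and the second incorrectly, and $f_{21}$ the reverse; $|z| > 1.96$ indicates a significant difference at the 5% level (two-sided). Which variant of the test was used in the experiments (for example, with or without a continuity correction) is not stated in the text, so this form is an assumption, and the data are made up.

```python
import math


def mcnemar_z(reference, pred_a, pred_b):
    """z statistic of McNemar's test for two classifiers on the same samples."""
    # f_ab: A correct and B wrong; f_ba: A wrong and B correct.
    f_ab = sum(1 for t, a, b in zip(reference, pred_a, pred_b) if a == t and b != t)
    f_ba = sum(1 for t, a, b in zip(reference, pred_a, pred_b) if a != t and b == t)
    if f_ab + f_ba == 0:
        return 0.0  # the classifiers never disagree in correctness
    return (f_ab - f_ba) / math.sqrt(f_ab + f_ba)


# Made-up example: classifier A is right on all 20 samples, B on only 4 of them.
reference = [0] * 20
pred_a = [0] * 20
pred_b = [1] * 16 + [0] * 4
z = mcnemar_z(reference, pred_a, pred_b)
significant_at_5_percent = abs(z) > 1.96
```

Because the two maps are evaluated on the same testing samples, a paired test of this kind is more appropriate than comparing the two OA values as if they came from independent samples.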

Step 5: Termination. Increment the loop counter, $b = b + 1$. The iteration stops when $b$ reaches maxIteration; otherwise, update the posterior probability of each pixel, $U = \{u^{new}_k(x_i)\}_{c \times n}$, according to Equation (12), go back to Step 2, and repeat.
The main contributions of CAMRF-FLI are as follows. (1) It is free of any experimentally adjusted parameters. (2) The class spatial dependency information is considered by introducing the class adaptive MRF-based prior probability function. (3) The spectral dissimilarity is taken into account by incorporating the local similarity measure based on fuzzy local information. The label of the central pixel is determined simultaneously by the spatial dependency and the spectral information of the pixel itself and of its neighbors. As a result, CAMRF-FLI can, to a certain extent, refine the classification map in homogeneous areas while reducing classification errors at boundary pixels.
Appl. Sci. 2018, 8, x

Figure 3. (a,b) OA obtained with different window sizes for the MV, GS, and CAMRF-FLI algorithms in Experiments 1 and 2, respectively.


Table 1. Number of samples, producer's accuracy, OA, and κ of the classification in Experiment 1.
