Sar Images Statistical Modeling and Classification Based on the Mixture of Alpha-stable Distributions

This paper proposes the mixture of Alpha-stable (MAS) distributions for modeling statistical property of Synthetic Aperture Radar (SAR) images in a supervised Markovian classification algorithm. Our work is motivated by the fact that natural scenes consist of various reflectors with different types that are typically concentrated within a small area, and SAR images generally exhibit sharp peaks, heavy tails, and even multimodal statistical property, especially at high resolution. Unimodal distributions do not fit such statistical property well, and thus a multimodal approach is necessary. Driven by the multimodality and impulsiveness of high resolution SAR images histogram, we utilize the mixture of Alpha-stable distributions to describe such characteristics. A pseudo-simulated annealing (PSA) estimator based on Markov chain Monte Carlo (MCMC) is present to efficiently estimate model parameters of the mixture of Alpha-stable distributions. To validate the proposed PSA estimator, we apply it to simulated data and compare its performance to that of a state-of-the-art estimator. Finally, we exploit the MAS distributions and a Markovian context for SAR images classification. The effectiveness of the proposed classifier is demonstrated by experiments on TerraSAR-X images, which verifies the validity of the MAS distributions for modeling and classification of SAR images.


Introduction
Synthetic aperture radar (SAR) works in the microwave band and has the ability to operate under all weather conditions.SAR images have been increasingly used in different areas [1][2][3].One fundamental technique for interpretation of SAR images is classification, which is of great importance for land planning, natural risk prevention, target recognition, etc.However, SAR image classification is still a tough work due to the existence of intense speckle.Different classification approaches have been considered in both supervised and unsupervised contexts, based on Markov random field [4], neural networks [5] and other methods [6].We focus on supervised classification of SAR images based on its statistical property in Markovian framework, in which a proper statistical distribution must be adopted to model SAR images data.
SAR images generally exhibit heavy-tailed and sharp-peaked statistical properties.Many distributions, e.g., Gamma ( [7]; Chapter 4), Weibull [8,9], Log-normal ( [7]; Chapter 5), [10], K [11,12], generalized Gamma [13], Fisher [14], G 0 [15], Alpha-stable and heavy-tailed Rayleigh [16,17], and recently proposed HG 0 [18], have been applied to the statistical modeling of SAR images.Gao [19] summarizes the development history and the current researching state of statistical modeling and discusses relevant issues in his work.However, as the resolution of the images increases, multimodal statistical properties become apparent due to the different types of backscatter from a single class.For instance, in an SAR image, building areas contain strong reflectors, textures, and shadows.In this case, the previously mentioned unimodal distributions fail to capture such multimodal statistical properties.Thus, a model with the ability to describe these attributes, as well as a heavy tail and sharp peak, is necessary.Alpha-stable distribution is a flexible model that can fit many other distributions into it.The appeal of Alpha-stable distribution also derives from theoretical reason, which states that stable processes arise as limiting processes of sums of independent identically distributed (i.i.d.) random variables via the generalized central limit theorem.Actually, the only possible nontrivial limit of normalized sums of i.i.d.terms is Alpha-stable.Moreover, Alpha-stable distribution has been used as a successful alternative for modeling non-Gaussian data and has also been applied to understand SAR images, e.g., for image restoration [17], object detection [20,21], image classification [22], and image fusion [23].Applications of Alpha-stable distribution in other areas such as watermark detection, network traffic and stock returns have also been reported [24][25][26].Driven by the multimodal and impulsive histogram of high resolution SAR image, as well as the theoretical appeal and successful applications of Alpha-stable family itself, we take the mixture of Alpha-stable distributions that can fit any situation for the statistical modeling of high resolution SAR images.
The main challenge in applying MAS distributions comes from the parameter estimation, as there is no analytical expression for the probability density function.Work on MAS distributions in the literature is very limited: Salas-Gonzalez and Kuruoglu [27] proposed a reversible jump Markov chain Monte Carlo (MCMC) method for MAS distributions with an unknown number of components in the Bayesian frame, but their estimations suffered from large deviations.The same authors introduced an estimator for a mixture of symmetric alpha-stable distributions with identical characteristic exponent for each component [28], and also an estimator for a mixture of skewed alpha-stable distributions with identical shape parameters for each component [29].In both cases, they considered the number of components to be unknown.Using Buckle's work [30] on the estimation of distribution parameters, Casarin [31] introduced a random auxiliary vector for MAS distributions with a fixed number of components.However, the algorithm necessary for this approach makes it difficult to draw samples for the auxiliary variable and is time consuming.
In this paper, we introduce a pseudo-simulated annealing (PSA) estimator for the MAS distributions.Experimental results on simulated data show that the PSA estimator obtains accurate parameter values for MAS distributions.We utilize the MAS distributions estimated by PSA for the statistical modeling and classification of SAR images.To verify the effectiveness of the mixture model, classification experiments are carried out using TerraSAR-X data.The supervised Bayesian maximum a posteriori (MAP) classifier is selected over MAS distributions in a Markov random field (MRF) framework.SAR images are represented by MAS distributions that are trained via the PSA estimator and appear as likelihood probabilities.In order to preserve spatial coherence, the MRF introduces prior information, which builds a restraint between pixels and their neighborhoods.The classification is then the solution to the maximization problem of the posterior probability.To solve this problem, we use the Graph Cuts method [32][33][34] to obtain a global optimization.The proposed classifier is validated using TerraSAR-X images.
The remainder of this paper is organized as follows.Section 2 introduces MAS distributions and the PSA estimator.Section 3 gives an explanation of the statistical modeling of SAR images, and Section 4 introduces the MAS-based MRF classification algorithm.Our experiments and results are presented in Section 5, and we summarize the conclusions in Section 6.

Alpha-Stable Distribution
The notion of stability was introduced by Levy [35] in his study of normalized sums of independent and identically distributed terms.Due to the lack of an analytical expression for the probability density function (PDF), Alpha-stable distributions are usually described in terms of the characteristic function as where j = √ −1, "sgn" is the sign function.The four parameters of Alpha-stable distribution are α ∈ (0, 2], β ∈ [−1, 1], γ ∈ R + and µ ∈ R, representing the characteristic exponent, skewness parameter, dispersion parameter, and location parameter, respectively.The parameter α sets the level of impulsiveness (smaller α gives a more picked PDF and a heavier tail); the parameter β controls the skewness of the PDF (symmetric if β = 0, negatively skewed if β = −1, and positively skewed if β = 1); the dispersion parameter and location parameter are similar to the variance and mean, respectively, in a Gaussian distribution.There are three special cases: the distribution reduces to a Gaussian distribution when α = 2, a Cauchy distribution when α = 1, β = 0, and a Levy distribution when α = 0.5, β = 1. Figure 1 illustrates the Alpha-stable distributions given by various parameter values.Although there is no closed-form expression for the PDF of an Alpha-stable distribution, it is possible to obtain the PDF by applying a Fourier transformation to the characteristic function: However, it is difficult to compute Equation (2) because the characteristic function is complex and the interval of integration is infinite.Therefore, Equation (2) does not admit an analytical solution except in a few special cases.In this paper, we utilize a FFT-based approach [36] to obtain the PDF of the Alpha-stable distribution.

Mixture of Alpha-Stable (MAS) Distributions
Mixture models allow us to describe and estimate complex multimodal data by considering them as sampled from different subpopulations.The density of MAS distributions is given by where ω k is the weight of the k th component.In a Bayesian scheme, we can estimate the distribution parameters via prior information and observations using the Bayesian rule where p(θ|X) denotes the posterior probability of θ given an observation is the prior probability, p(X|θ) is thte likelihood of X given θ, and p(X) is the prior probability of X.

PSA Estimator for MAS Distributions
The Bayesian approach is an efficient technique for the estimation of MAS distribution parameters.This technique has received a lot of attention recently thanks to the evolution of MCMC computational tools.We name this the PSA estimator.
When estimating its parameters, it is convenient to consider the mixture model as a missing data problem.We assume that the data vector has been randomly drawn from K subpopulations.For each variable x j , let z j be an allocation variable for the unobserved or missing variable that indicates to which component x j belongs.Thus, z j will be equal to 1 if x j belongs to the k th subpopulation, or 0 otherwise.
Each observation x j is considered to be drawn from the k th component described by The allocation variable z j can be defined as follows Parameters then can be updated based on {z j }.
The candidate parameter

and is accepted with probability A θ new k
, where In Equation ( 7), T (t) is the temperature function, which cools as the iteration proceeds.The introduction of T (t) can prevent the chain from trapping in small area, especially in case the target distribution may have multiple peaks.p(θ) is the prior function for parameter vector θ.In the case of an independent prior for each parameter, p(θ) can be expressed in terms of product p(θ) = p(α)p(β)p(γ)p(µ).We take the prior distributions of α, β, γ and µ to be uniform, uniform, inverse gamma (IG(•|•)) and normal (N (•|•)), respectively.In addition, we use a symmetric proposal satisfying q(θ new k |θ old k ) = q(θ old k |θ new k ) for simplicity.The acceptance probability A θ new k can then be simplified as where a 0 and b 0 are parameters of the inverse gamma prior for γ, and κ and ξ are parameters of the normal prior for µ.Moreover, a normal proposal distribution is utilized in our algorithm, and its parameters are adaptively updated using estimations obtained in previous iterations: δ is set to the estimation of the previous iteration, and σ is set to the standard deviation of the previous L estimations.Adaptively updating the parameters of the proposal distribution ensures that new candidates can properly explore the entire parameter space, which further ensures that estimations rapidly converge to the true values.The PSA estimator is described in Algorithm 1.
The input includes observation X = (x 1 , x 2 , • • • , x M ), the maximum iteration number Iter, the number of components K, starting temperature T (0), the value of L for updating the parameters of the proposal distribution, and the initial parameters θ 0 k , ω 0 k for each component.
Algorithm 1 PSA estimator for parameters of MAS distributions 1: Input: For each data sample x j , obtain allocation variable z j using Equation ( Update parameters of the proposal distribution q(•|•) = N (•|δ, σ): set δ to value of the previous iteration and set σ to the standard deviation of the previous L estimations if t > L, otherwise set σ to its initialization value 7: Sample new candidates [27] by drawing samples from distribution ω ∼ D, where and n k is the number of samples assigned to the k th component 10: end for 11: Output: The critical differences between PSA estimator and Salas-Gonzalez's method [27] are twofold.Firstly, PSA takes a cooling schedule, introducing T (t) = T (0) lg(t+1) over the acceptance probability.This ensures a reasonable probability of a downhill move, thus preventing the estimation from being trapped in small region of the parameter space for long periods of time.When the chain is trapped in small region, it is worthless to the estimation but will slow down the global convergence of the estimation.Secondly, in PSA estimator, parameters of the proposal distribution are adaptively updated by using the former estimations, which makes the proposal distribution increasingly closer to the distribution of the true parameter as the estimation procedure proceeds.

Simulation Result for PSA Estimator on MAS Distributions
To illustrate the effectiveness of the PSA estimator, we conduct an experiment using the MAS model shown in Equation (9).Random samples are generated using the algorithm proposed in [37].2(a-e).These show that the PSA estimator has a good convergence procedure.Figure 2(f) displays the discrete histogram of the originally simulated data, the PDF curve of the estimated MAS distributions with three components and the PDF curve of the true MAS distributions.This illustrates that the estimated MAS distributions fit the simulated data well.
To quantitatively evaluate the performance of the PSA estimator, we list in Table 1 true values, starting values, estimated values, and standard deviation (std) by PSA estimator.Results of Salas-Gonzalez's method [27] are also included for comparison.Details of the initial conditions and implementation for Salas-Gonzalez's method were reported in his work [27].  1 shows that 9 of the 15 values estimated by PSA are closer to true values than the estimates given by Salas-Gonzalez's method.The estimations of the two methods for ω 1 , ω 2 , ω 3 and γ 3 are similar and the estimations of PSA for two other parameters µ 1 and µ 3 are slightly worse than those of Salas-Gonzalez's method.In addition, the standard deviation obtained by the proposed estimator is smaller than those of Salas-Gonzalez's method.These results demonstrate the effectiveness of the proposed PSA estimator.Therefore, it can be concluded that the proposed PSA estimator is accurate for MAS distributions.

MAS-Based Statistical Modeling of SAR Images
A statistical model offers a description of SAR images' statistical properties.The quality of classification depends on how well the statistical model agrees with the statistical properties.As mentioned in Section 1, many different models for SAR images exist, but these do not provide a good fit.The high probability of high intensities, produced by a large number of strong reflectors, leads to a heavy tail with a rather slow discrepancy in the SAR image histogram.In addition, each class of SAR image, especially at high resolutions, is usually a combination of different objects.For example, building areas generally contain strong reflectors, textures, and shadows.Such images present multimodal statistical properties; therefore, we expect that a multimodal distribution will be required to describe them.
We first analyze the performance of existing statistical models using TerraSAR-X image samples representing four different classes (river, marsh, farmland, and buildings).Each class contains 600 samples of 64 pixels × 64 pixels.Figure 3   Kolmogorov-Smirnov (KS) test is applied to evaluate whether a particular model is able to describe the statistical properties of the SAR images.We calculate the KS distance (KSD) between the empirical distribution function (EDF) and the cumulative distribution function (CDF) of the reference distribution.The KSD is defined by Equation ( 10): where F (x n ) is the CDF and F (x n ) is the EDF.In addition, we also considered the acceptance probability, which is defined as the ratio of the number of samples that the KS test accepts for a distribution (at the 5% significance level) to the total number of samples in a class.The evaluation results are shown in Tables 2 and 3.
Table 2 shows that some models may be suitable for one class but not for another.Gamma distributions are not good for modeling marsh areas and building areas, because the corresponding average KSDs are particularly large.The Weibull and K distributions do not fit marsh areas well, whereas the G 0 distribution can model marsh and building, but not river or farmland.This indicates that G 0 distribution is suitable for modeling heterogeneous but not for homogeneous images.The Alpha-stable distribution is robust for various SAR image classes, as its average KSDs are not larger than 0.046, and it provides the best model for marsh areas.In addition, the maximum KSDs of the Alpha-stable distribution are the smallest of the different distributions tested.However, the average KSDs of the Alpha-stable distribution are still large.Similar results are also displayed in Table 3, which shows that Gamma distribution and Weibull distribution have a high probability of acceptance for modeling river and farmland, whereas they are always rejected for modeling the other two classes.The K distribution is always rejected when modeling marsh areas.The Alpha-stable distribution has a high probability of acceptance, but is not always accepted for modeling all four classes.These results indicate that unimodal distributions are not sufficient for modeling high-resolution SAR images due to the multimodal statistical properties inherent in such images.To intuitively illustrate the multimodal statistical property of SAR image, we consider a 512 pixels × 768 pixels image of a building area from TerraSAR-X data.Figure 4 displays the original SAR image (Figure 4(a)), the fitting results (Figure 4(b)) of different models (including Gamma, Alpha-stable and MAS), and the estimated MAS distributions with three components (Figure 4(c)).From Figure 4(a), we can observe that many bright regions exist, caused by the existence of many strong reflectors in building area.Shadows of darkness are also observed, a result of the imaging mechanism of the SAR system, and few scatter waves are received by the SAR system in this kind of area.These factors lead to the complicated multimodal statistics of SAR images.In this case, none of unimodal distributions has the ability to model it well.Therefore, a multimodal distribution is necessary for accurate modeling of SAR images.MAS distributions can capture this essential statistical property pretty well.
We further evaluate the performance of MAS distributions with different number of components for modeling SAR image samples used previously.The results are shown in Tables 4 and 5.
From Tables 4 and 5, it can be concluded that MAS distributions substantially improve the modeling performances for all four classes.As we increase the number of components, the average KSD decreases and the acceptance probability increases for each class.The results also indicate that MAS distributions with five components are sufficient, as the average KSD and acceptance probability for each class change only slightly when the number of components increase to seven.Therefore, we choose the MAS distributions with five components to model the statistics of SAR images in the proposed classification algorithm.

MAS-Based MRF Classification Algorithm
We propose the use of MAS distributions in a Bayesian classification scheme.MAS distributions are utilized to model the statistical properties of SAR images, and the spatial context is introduced by an MRF framework.A Graph Cuts-based algorithm is applied to obtain a global optimization.The restoration method proposed in [38] assumes that the conditional probability of labeling Y = y, given an observation X = x, can be modeled by a Markov random field.Using the Hammersley-Clifford theorem, this probability can be written under Gibbs field formalism as Equation (11):

MRF-Based Segmentation
where Z is a normalization coeffient, and U is the energy function given by the sum of a data term and a regularization term Equation (12): where C y is the set of cliques of the selected neighborhood, and V c is the potential of label configuration.
In this paper, we focus solely on the data term, which is evaluated by mixture of Alpha-stable distributions.The second term is simply modeled by the Potts model [39], which can be written as Equation ( 13): with s and t being in the same clique.Result of the Markovian segmentation is the label field Y = y that minimizes U (or equivalently maximizes P ).In the energy function U of random field, mixture of Alpha-stable distributions are exploited to drive the data term.

MAS-Based MRF Classification Algorithm
We propose a supervised Bayesian classification algorithm by combining MAS distributions with MRF.Samples are required for each class to estimate the MAS distribution parameters via the PSA estimator.The first term of Equation (12) implies that the conditional probabilities are then represented by the MAS distributions.The regularization term is introduced via MRF, which builds a labeling restraint between the current pixel and its neighborhood.The classification turns out to be a posterior probability maximization problem, which is equivalent to minimizing the energy.To solve the energy minimization problem, we utilize a Graph Cuts optimization, as this is fast and obtains near-global optimization.The proposed MAS-based MRF classification algorithm is described in Figure 5.

Experiment and Result Analysis
The MAS-based MRF classifier was applied to the TerraSAR-X data.We compared the results from the MAS-based MRF classifier (MAS + MRF) with three other classifiers: Alpha-stable (AS)-based MRF classifier (AS + MRF), Gamma (GAM)-based MRF classifier (GAM + MRF), and mixture Gamma (MGAM)-based MRF classifier (MGAM + MRF).To enable a quantitative analysis, the classification accuracy for each class was calculated as the ratio of the number of correctly classified pixels to the total number of pixels in the class, given as a percentage with reference to the visually interpreted map.The regularity parameter for the Graph Cuts algorithm was empirically set to λ = 8.

Wuhan Data
The TerraSAR-X data for Wuhan in Hubei, China, was acquired on 28 September 2008, with a spatial resolution of approximately 3.0 meters in range and azimuth (Stripmap mode, multilook ground range detected).The image, shown in Figure 6(a), has a size of 2,000 pixels × 1,800 pixels.Figure 6(b) is the optical image from Google Earth ( c ⃝ Google 2013).Figure 6(c) is the visually interpreted map.A total of four land covers were identified, namely river, marsh, farmland, and buildings.The white areas labeled 'None' in Figure 6(c) could not be defined.These regions were not included in the accuracy assessment step.

Foshan Dataset
The TerraSAR-X data for Foshan in Guangdong, China, was acquired on 24 May 2008, with a spatial resolution of approximately 1.25 meters in both range and azimuth (Stripmap mode, multilook ground range detected).The image, shown in Figure 7(a), has a size of 7,000 pixels × 5,000 pixels.Figure 7(b) is the optical image from Google Earth ( c ⃝ Google 2013).Figure 7(c) is the visually interpreted map.
Again, four land covers were identified, i.e., river, marsh, farmland, and buildings.The black areas labeled 'None' in Figure 7(c) could not be defined.These regions were not included in the accuracy assessment step.

Classification Result
In this experiment, samples were selected from each class in order to learn the parameters of the MAS distributions.The Potts model was applied to introduce the spatial contextual information, which behaves as the regularization term.Various classifiers were selected to compare their classification performance.The results from the MAS + MRF classifier for the Wuhan and Foshan data are displayed in Figures 6(d) and 7(d), respectively.These both show that the global ground cover acquired by the MAS + MRF classifier was well recognized.In particular, the region edges were accurate.In our experiment, we found that the AS + MRF classifier and the MGAM + MRF classifier also obtained To evaluate the performance of the various classifiers in a localized area, we studied the 480 pixels × 480 pixels image indicated by the red box in Figure 7(a).The visually interpreted map and classifications are displayed in Figure 10.The GAM + MRF classifier result is not good, as it misclassifies large areas of marsh as river.The AS + MRF classifier and MGAM + MRF classifier obtain slightly better classification results than GAM + MRF.However, all three classifiers misclassify large areas of buildings as marsh.Fortunately, the proposed MAS + MRF classifier correctly labels these areas.It can be observed that different regions are perfectly distinguished, and the edges are also accurate.These results further illustrate the efficiency of the proposed MAS + MRF classifier.

Conclusions
This paper proposed the mixture of Alpha-stable distributions, the first application in image processing, to model the multimodal and impulsive statistical properties of SAR images especially at high resolution, and introduced a PSA estimator to determine the distribution parameters.Simulation results on simulated data validated the effectiveness of the PSA estimator, and the modeling performance over TerraSAR-X samples verified the validity of MAS distributions for SAR images.Finally, a Bayesian classification algorithm that combines MAS distributions and MRF was presented for SAR images.The classification performances over TerraSAR-X images demonstrated that the proposed mixture model is a promising candidate for the modeling and classification of SAR images and can be potentially useful for SAR images analysis, interpretation, and other important applications.

9 )Figure 2 .
Figure 2. Simulation results for the PSA estimator.(a)-(e) Evolution procedure of the three components for parameters α, β, γ, µ and ω, respectively (red: component 1, green: component 2, blue: component 3); (f) Fitting results of the estimated MAS distributions to simulated data histogram and the PDF curve obtained by the true parameters.
displays eight samples from each class.

Figure 3 .
Figure 3. Eight samples from each class (from top to bottom: river, marsh, farmland and building).

Figure 4 .
Figure 4. Selected building area and fitting results.(a) Building area in SAR image; (b) Fitting results by different models; (c) Three components of the estimated MAS distributions.
be a random observation and Y = (y 1 , • • • , y j , • • • , y M ) be the expected labeling result, with y j ∈ {1, 2, • • • , K}.In the Bayesian estimation framework, the SAR image classification could be described as a MAP problem, Y = arg max Y P (Y |X).Based on the Bayesian theory, P (Y |X) ∝ P (X|Y )P (Y ), it is equivalent with Y = arg max Y P (X|Y )P (Y ).

Figure 5 .
Figure 5. Flowchart of the MAS-based MRF classification algorithm.

Table 1 .
Simulation results for the PSA estimator on MAS distributions.

Table 3 .
Acceptance probability of different statistical models.

Table 4 .
Mixture of Alpha-stable distributions for modeling various SAR images.

Table 5 .
Acceptance probability of mixture models with different number of components.