Multiscale Unsupervised Segmentation of SAR Imagery Using the Genetic Algorithm.

An unsupervised multiscale segmentation method for synthetic aperture radar (SAR) imagery is proposed, based on GA-EM, a combination of the expectation-maximization (EM) algorithm with the genetic algorithm (GA). The mixture multiscale autoregressive (MMAR) model is introduced to characterize and exploit the scale-to-scale statistical variations and the statistical variations within the same scale that radar speckle induces in SAR imagery, and a segmentation method is given by combining the GA with the EM algorithm. The algorithm is capable of selecting the number of components of the model using the minimum description length (MDL) criterion. Our approach benefits from the properties of both the genetic and the EM algorithm by combining them into a single procedure. The population-based stochastic search of the GA explores the search space more thoroughly than the EM method. Therefore, our algorithm can escape from locally optimal solutions, since it becomes less sensitive to its initialization. Experimental results for the proposed approach are given and compared with those of the EM algorithm. The experiments on SAR images show that GA-EM outperforms the EM method.


Introduction
In recent years, SAR imaging has been rapidly gaining prominence in applications such as remote sensing, surface surveillance and automatic target recognition. For these applications, the segmentation of various categories of clutter is quite important, and it can play a key role in the subsequent analysis for target detection, recognition and image compression. Because of the nature of the SAR instrument, SAR images contain speckle noise, which complicates their segmentation. Several segmentation methods designed specifically for SAR data have been proposed. One way to deal with speckle is a multiscale approach, which exploits the coherent nature of SAR image formation. In particular, we build on the idea of characterizing and exploiting the scale-to-scale statistical variations and the statistical variations within the same scale that radar speckle induces in SAR imagery [1][2][3]. To fully exploit this phenomenon and its complexity, we recently introduced the mixture multiscale autoregressive (MMAR) model [4], and proposed both an EM algorithm and a bootstrap stochastic annealing EM algorithm for learning the parameters of the model. However, those EM algorithms converge to a local optimum, and the result is sensitive to initialization. Additionally, the EM algorithm assumes that the number of components for modeling the distributions is known, which is not the case for many applications. In this paper, we propose an algorithm for finding the optimal number of components as well as the parameters that determine the components of the MMAR model. The minimum description length (MDL) criterion is used for selecting the number of components of the model. Our approach embeds the EM algorithm and the deterministic annealing approach in the framework of the genetic algorithm (GA) so that the properties of all three algorithms are utilized.
The population-based stochastic search of the GA explores the search space more thoroughly than the EM method. Therefore, our algorithm can escape from locally optimal solutions, since it becomes less sensitive to its initialization. Our algorithm also enables the selection of the number of classes using the MDL principle. This paper is organized as follows. In the next section, we describe the quadtree interpretation of SAR imagery and its MMAR modeling. In Section 3, we propose a hybrid method based on the GA and the EM algorithm for the MMAR model. In Section 4, we present the experimental results. In Section 5, we give a short conclusion concerning our algorithm.

Quadtree Interpretation of SAR Imagery and Its MMAR Model
The starting point for our model development is a multiscale sequence $X_L, X_{L-1}, \ldots, X_0$ of SAR images, where $X_L$ and $X_0$ correspond to the coarsest and finest resolution images, respectively. The resolution varies dyadically between images at successive scales. More precisely, we assume that the finest-scale image $X_0$ has a resolution of [...] blocks, performing log-detection (computing 20 times the log-magnitude), and correcting for zero-frequency gain variations by subtracting the mean value. Accordingly, each pixel in image $X_m$ corresponds to four "child" pixels in image $X_{m-1}$, so a quadtree is the natural structure for the mapping. Each node $s$ on the tree is associated with one of the pixels. As an example, Figure 1 illustrates a multiscale sequence of three SAR images, together with the quadtree mapping. Here the finest-scale SAR image is mapped to the finest level of the tree, and each coarser-scale representation is mapped to a successively higher level. We use the notation $X(s)$ to indicate the pixel mapped to node $s$. The scale of node $s$ is denoted by $m(s)$.

Figure 1. Sequence of three multiresolution SAR images mapped onto a quadtree.
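The dyadic pixel-to-parent relation can be illustrated with a small sketch. It assumes that each coarser image is obtained by averaging 2 × 2 blocks of the next finer image; the log-detection and mean-subtraction steps are omitted, and the names `build_pyramid` and `parent` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def build_pyramid(x0, levels):
    """Return [X_0, X_1, ..., X_L], finest first.

    Assumption: each coarser image averages 2x2 blocks of the finer one.
    """
    pyramid = [np.asarray(x0, dtype=float)]
    for _ in range(levels):
        fine = pyramid[-1]
        h, w = fine.shape
        if h % 2 or w % 2:
            raise ValueError("image sides must be even at every level")
        # Group pixels into 2x2 blocks and average each block.
        coarse = fine.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(coarse)
    return pyramid

def parent(row, col):
    """Index of the parent pixel, one scale coarser, of pixel (row, col)."""
    return row // 2, col // 2

pyramid = build_pyramid(np.arange(64.0).reshape(8, 8), levels=2)
print([level.shape for level in pyramid])
```

With this indexing, the four children of pixel (r, c) at scale m are the pixels (2r, 2c), (2r, 2c + 1), (2r + 1, 2c) and (2r + 1, 2c + 1) at scale m − 1.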
In this paper, we focus on a specific class of multiscale models, namely mixture multiscale autoregressive models [4] of the form: [...] where $\gamma$ is defined to reference the parent of node $s$, and $p_k$ is the order of the regression. Moreover, the coefficients $a_i(s)$ and variance $\sigma_i$ depend only on [...]. $\Phi$ is the standard normal distribution function and $\phi$ is the probability density function of a standard normal distribution.
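For orientation, one form that is consistent with the symbols defined here is sketched below. It is an assumption modeled on standard mixture autoregressive models, not a quotation of the model equation in [4]: the number of components $K$, the intercept $a_{k,0}$, the double index on the coefficients $a_{k,i}(s)$, and the notation $s\gamma^{i}$ for the $i$-th ancestor of node $s$ are introduced only for this illustration.

```latex
F\bigl(X(s) \mid X(s\gamma), X(s\gamma^{2}), \ldots\bigr)
  = \sum_{k=1}^{K} \pi_k \,
    \Phi\!\left(
      \frac{X(s) - a_{k,0} - \sum_{i=1}^{p_k} a_{k,i}(s)\, X(s\gamma^{i})}
           {\sigma_k}
    \right)
```

Under this reading, each component $k$ is a Gaussian autoregression of order $p_k$ on the ancestors of $s$, and the weights $\pi_k$ sum to one.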
In Bayesian unsupervised segmentation using parametric estimation, segmentation rests on identifying the model. The most commonly used estimator is the maximum likelihood (ML) estimator, which is computed by the classical EM algorithm [4]. The details are as follows.

A. Expectation Step
The posterior probability for a pixel $X(s) \in X_0$ to belong to class $k$ at the [...] iteration is given by [...]

B. Maximization Step
In this step, $w_{s,k}$ is treated as the a posteriori probability of $X(s)$, so that, in the next iteration, we have [...] where $L(K, \Theta)$ is the likelihood function.
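The expectation step can be sketched as follows for a mixture of Gaussian autoregressions. This is a minimal illustration under stated assumptions rather than the update from [4]: all components share the same regression order, each component has an intercept, and the function name `e_step` and its argument layout are hypothetical.

```python
import numpy as np

def e_step(x, ancestors, weights, coeffs, sigmas):
    """Posterior w[s, k] that pixel s belongs to component k.

    x:         (N,) pixel values at the finest scale
    ancestors: (N, p) values of the p nearest ancestors of each pixel
    weights:   (K,) mixing weights pi_k
    coeffs:    (K, p + 1) intercept followed by p regression coefficients
    sigmas:    (K,) residual standard deviations (assumed parameterization)
    """
    design = np.hstack([np.ones((len(x), 1)), ancestors])   # (N, p + 1)
    resid = x[:, None] - design @ coeffs.T                  # (N, K)
    log_pdf = (-0.5 * (resid / sigmas) ** 2
               - np.log(sigmas) - 0.5 * np.log(2 * np.pi))
    log_joint = np.log(weights) + log_pdf
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    w = np.exp(log_joint)
    return w / w.sum(axis=1, keepdims=True)

# Two pixels, two well-separated components, first-order regression.
w = e_step(
    x=np.array([0.0, 10.0]),
    ancestors=np.zeros((2, 1)),
    weights=np.array([0.5, 0.5]),
    coeffs=np.array([[0.0, 0.0], [10.0, 0.0]]),
    sigmas=np.array([1.0, 1.0]),
)
```

Each row of `w` sums to one, and in this toy case the first pixel is assigned almost entirely to the first component and the second pixel to the second.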

Hybrid Method of GA and EM Algorithm
The main goal of interweaving the GA with the EM algorithm is to utilize the properties of both algorithms. Similar to the method in [5], each individual in the population of the GA-EM algorithm represents a possible solution of the MMAR model. The MDL criterion is used as the fitness function for model selection: the best individual is the one with the lowest MDL value. The evaluation of the individuals in the population is twofold. First, R cycles of the EM algorithm are performed on each individual, which updates the set of parameters and consequently the individual that encodes them. If the relative log likelihood drops below a threshold, we terminate the EM early and do not perform all R cycles; this might be the case for a large value of R. Second, the MDL value is determined for each updated individual to judge the model. Hence, the evaluation of an individual provides both a fitness value and an update of the parameters it encodes.
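As a point of reference, a commonly used two-part form of the MDL score is the negative log likelihood plus half the number of free parameters times the logarithm of the sample size. The sketch below assumes this form, which may differ from the exact criterion used here; the parameter count in `count_params` is likewise an assumed layout (intercept, regression coefficients and one variance per component, plus the free mixing weights).

```python
import math

def mdl(log_likelihood, n_free_params, n_samples):
    """Two-part MDL score (assumed form); lower is better."""
    return -log_likelihood + 0.5 * n_free_params * math.log(n_samples)

def count_params(n_components, order):
    """Free parameters of a K-component mixture of order-p regressions.

    Assumed layout: intercept + p coefficients + one variance per component,
    plus K - 1 free mixing weights.
    """
    per_component = (order + 1) + 1
    return n_components * per_component + (n_components - 1)

score = mdl(log_likelihood=-100.0, n_free_params=count_params(3, 2), n_samples=100)
```

Because the penalty grows with the number of components, a model with more components is preferred only if it raises the log likelihood by more than the added penalty.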
The framework of the GA-EM algorithm is presented in the following:

procedure GA-EM
begin
[...]

Due to the switching mechanism of the components among the individuals during the evolution of the GA, the component weights $\pi_k$ cannot be encoded. Except for the best individual, these weights are assumed to be uniformly distributed.
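The overall control flow described in this section can be sketched as a generic loop. This is an illustration under stated assumptions, not the procedure listing itself: every operator is a caller-supplied (hypothetical) callable, the loop runs for a fixed number of generations instead of testing the relative log likelihood, and enforced mutation and early termination of the EM cycles are left out.

```python
import random

def ga_em(population, em_update, fitness, recombine, mutate,
          n_offspring, max_generations=50, em_cycles=3, rng=None):
    """Generic GA-EM loop; fitness plays the role of the MDL value."""
    rng = rng or random.Random(0)

    def refine(individuals):
        # R cycles of EM update each individual before it is scored.
        refined = []
        for ind in individuals:
            for _ in range(em_cycles):
                ind = em_update(ind)
            refined.append(ind)
        return refined

    best, best_score = None, float("inf")
    for _ in range(max_generations):
        population = refine(population)
        offspring = []
        while len(offspring) < n_offspring:
            a, b = rng.sample(population, 2)
            offspring.extend(recombine(a, b, rng))
        offspring = refine(offspring[:n_offspring])
        # Keep the lowest-fitness individuals from parents and offspring.
        survivors = sorted(population + offspring, key=fitness)[:len(population)]
        if fitness(survivors[0]) < best_score:
            best, best_score = survivors[0], fitness(survivors[0])
        # Elitism: the best individual is carried over without mutation.
        population = [survivors[0]] + [mutate(ind, rng) for ind in survivors[1:]]
    return best
```

A toy call with scalar individuals, where `em_update` halves the value and `fitness` is the absolute value, exercises the loop: `ga_em([4.0, -3.0, 2.0, 5.0], lambda x: 0.5 * x, abs, lambda a, b, rng: [(a + b) / 2, (a - b) / 2], lambda x, rng: x + rng.uniform(-0.1, 0.1), n_offspring=4, max_generations=5)`.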

Recombination
The crossover operator selects two parent individuals at random from the population, picks a crossover position within the first part of the individual, and exchanges the values of the genes to the right of this position between the two individuals, swapping each gene in the first part together with its associated parameters in the second part.
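A single-point crossover of this kind can be sketched as below. The encoding is an assumption made for illustration: an individual is a pair `(mask, params)`, where `mask[i]` is the binary gene that switches component `i` on or off and `params[i]` holds that component's parameters. The recombination probability is not applied inside the function.

```python
import random

def single_point_crossover(parent_a, parent_b, rng=None):
    """Swap the genes to the right of a random cut, with their parameters.

    Each parent is a (mask, params) pair of equal-length lists (assumed
    encoding); the lists must contain at least two genes.
    """
    rng = rng or random.Random()
    mask_a, params_a = parent_a
    mask_b, params_b = parent_b
    cut = rng.randrange(1, len(mask_a))   # cut position inside the mask
    child_a = (mask_a[:cut] + mask_b[cut:], params_a[:cut] + params_b[cut:])
    child_b = (mask_b[:cut] + mask_a[cut:], params_b[:cut] + params_a[cut:])
    return child_a, child_b

a = ([1, 1, 1, 1], ["a0", "a1", "a2", "a3"])
b = ([0, 0, 0, 0], ["b0", "b1", "b2", "b3"])
child_a, child_b = single_point_crossover(a, b, random.Random(1))
```

Slicing the mask and the parameter list at the same position keeps every gene paired with the parameters it came with.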

Selection
For selection, the (M,H)-strategy [7] is used. This approach considers both the parent population p'(t) and the offspring population p m (t), which contain M and H individuals, respectively. After both populations have been evaluated, the M best individuals are selected to form the population p''(t) for the next generation.
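The selection step amounts to ranking by fitness and truncating. The sketch below assumes that parents and offspring are pooled before ranking and that a lower fitness (MDL) value is better; `select_best` is a hypothetical helper name.

```python
def select_best(parents, offspring, fitness, n_keep):
    """Return the n_keep individuals with the lowest fitness value.

    Assumption: parents and offspring compete in one pooled ranking.
    """
    return sorted(parents + offspring, key=fitness)[:n_keep]

survivors = select_best([3.0, 1.0], [2.0, 0.5, 5.0], fitness=lambda v: v, n_keep=2)
```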

Enforced Mutation
If several components model the data points in a similar manner, some of their parameters are forced to mutate. This similarity is measured using the correlation coefficient. If the correlation coefficient of two components is above the threshold, one of the two is randomly selected and added to the candidate set for mutation. Once the candidate set for enforced mutation is complete, a binary value is sampled from a uniform distribution for each candidate. Depending on this value, the candidate component may be removed by resetting the corresponding bit in the first part of the individual.
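Building the candidate set can be sketched as follows. The sketch assumes that the correlation coefficient is computed between the posterior probabilities that two components assign to the pixels; the function name and argument layout are hypothetical.

```python
import numpy as np

def enforced_mutation_candidates(posteriors, active, threshold=0.95, rng=None):
    """Pick one component out of every highly correlated active pair.

    posteriors: (N, K) array; column k holds the posterior of component k
                for every pixel (assumed similarity measure)
    active:     list of indices of the components currently switched on
    """
    rng = rng or np.random.default_rng()
    corr = np.corrcoef(posteriors[:, active], rowvar=False)
    candidates = set()
    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            if corr[i, j] > threshold:
                # Choose one of the two similar components at random.
                pick = i if rng.random() < 0.5 else j
                candidates.add(active[pick])
    return sorted(candidates)

base = np.linspace(0.1, 0.9, 50)
demo = np.column_stack([base, base + 1e-3, 1.0 - base])
picked = enforced_mutation_candidates(demo, [0, 1, 2], rng=np.random.default_rng(0))
```

In the toy data the first two columns are almost perfectly correlated, so exactly one of them ends up in the candidate set, while the third column is left alone.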

Mutation
The mutation operator inverts the binary value of each gene in the first part of the individuals with mutation probability $p_m$. For the second part of the individual, a uniformly distributed random number sampled between a lower and an upper bound is assigned to the genes that are mutated. These bounds were determined from the data set. The mutation rate for the value encoding is scaled down by a factor equal to the number of parameters of each component. The mutation of the value-encoded part of the individual is restricted to the parameter values. Since our GA-EM is elitist, no mutations are performed on the best individual.
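The two-part mutation can be sketched as below, again assuming the `(mask, params)` encoding in which `params[i]` is the list of parameter values of component `i`. The reading of "scaled down" as dividing the mutation probability by the number of parameters per component is an assumption, as are the names `mutate`, `lower` and `upper`.

```python
import random

def mutate(individual, p_m, lower, upper, rng=None):
    """Flip mask bits with probability p_m; resample parameters at a lower rate.

    lower[j] and upper[j] bound parameter j of a component and are meant to
    be derived from the data (assumed layout).
    """
    rng = rng or random.Random()
    mask, params = individual
    new_mask = [1 - bit if rng.random() < p_m else bit for bit in mask]
    new_params = []
    for component in params:
        # Assumed scaling: divide the rate by the parameters per component.
        rate = p_m / len(component)
        new_params.append([
            rng.uniform(lower[j], upper[j]) if rng.random() < rate else value
            for j, value in enumerate(component)
        ])
    return new_mask, new_params

individual = ([1, 0, 1], [[0.5], [0.5], [0.5]])
unchanged = mutate(individual, 0.0, [2.0], [3.0], random.Random(0))
flipped = mutate(individual, 1.0, [2.0], [3.0], random.Random(0))
```

An elitist loop would simply skip this call for the best individual of the current population.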
After the number of SAR image regions has been detected and the model parameters have been estimated, SAR image segmentation is performed by classifying pixels. A Bayesian classifier is used for the classification; that is, each $X(s)$ is assigned a class $k$ as follows: [...]
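A standard Bayesian (maximum a posteriori) decision rule that fits this description is sketched below. It assumes that the classifier uses the posterior probabilities $w_{s,k}$ from the expectation step, evaluated at the final parameter estimates; the symbols $\hat{k}(s)$ for the assigned class and $K$ for the selected number of components are introduced only for this illustration.

```latex
\hat{k}(s) = \arg\max_{1 \le k \le K} \; w_{s,k}
```

Each finest-scale pixel is thus labeled with the component that explains it with the highest posterior probability.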

Experiments
To demonstrate the segmentation performance of the proposed algorithm, we applied it to two complex SAR images of 200 × 200 pixels, consisting of woodlands and cornfields [see Figure 2(a)]. From the complex images, we generated the quadtree representation described above with $L = 3$ levels and used a second-order regression. The weight $\pi_i$ of each component is selected randomly. The maximum number of Gaussian components in the data is assumed to be $M_{\max} = 15$ for both the EM and the GA-EM algorithm. The parameter settings for the GA-EM are $p_m = 0.02$ for the mutation probability, $p_c = 0.8$ for the recombination probability, $K = 6$ for the population size, $R = 3$ for the number of EM steps within one GA iteration, and 0.95 for the component correlation threshold. The EM algorithm is executed for 2 to $M_{\max}$ components, and the selected model is the one that achieves the lowest MDL value among the candidate models obtained. Both algorithms terminate when the relative log likelihood drops below 0.001. Figure 2(c) shows the results of applying the GA-EM approach to the two SAR images, and Figure 2(b) shows the results of the EM algorithm for comparison. Table 1 compares the EM and the GA-EM in terms of the percentage of pixels that are correctly segmented using the best model. The results show that the GA-EM slightly outperforms the EM algorithm.
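Because an unsupervised method assigns arbitrary class indices, a percentage of correctly segmented pixels requires the predicted labels to be matched to the reference labels first. How this matching was done is not stated here; the sketch below assumes the relabeling that maximizes agreement is used, which is practical only for a small number of classes since it tries every permutation. The function name is hypothetical.

```python
from itertools import permutations

import numpy as np

def segmentation_accuracy(predicted, reference, n_classes):
    """Percentage of correctly labeled pixels under the best relabeling."""
    predicted = np.asarray(predicted).ravel()
    reference = np.asarray(reference).ravel()
    best = 0.0
    for perm in permutations(range(n_classes)):
        # Map predicted label k to perm[k] and compare with the reference.
        relabeled = np.asarray(perm)[predicted]
        best = max(best, float(np.mean(relabeled == reference)))
    return 100.0 * best

accuracy = segmentation_accuracy([0, 0, 1, 1], [1, 1, 0, 1], n_classes=2)
```

In the toy call, swapping the two predicted labels matches three of the four pixels, so the score is 75 percent.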

Conclusions
We combine the GA with the EM algorithm (denoted GA-EM) and apply it to the segmentation of SAR images based on the MMAR model of SAR imagery. The combined algorithm improves ML parameter estimation and is less sensitive to initialization than the standard EM algorithm. Experimental results show that the GA-EM algorithm produces segmented images of better quality than the classical EM algorithm.