MR Brain Image Segmentation: A Framework to Compare Different Clustering Techniques

Abstract: In Magnetic Resonance (MR) brain image analysis, segmentation is commonly used for detecting, measuring and analyzing the main anatomical structures of the brain and eventually identifying pathological regions. Brain image segmentation is of fundamental importance since it helps clinicians and researchers to concentrate on specific regions of the brain in order to analyze them. However, segmentation of brain images is a difficult task due to high similarities and correlations of intensity among different regions of the brain image. Among various methods proposed in the literature, clustering algorithms prove to be successful tools for image segmentation. In this paper, we present a framework for image segmentation that is devoted to support the expert in identifying different brain regions for further analysis. The framework includes different clustering methods to perform segmentation of MR images. Furthermore, it enables easy comparison of different segmentation results by providing a quantitative evaluation using an entropy-based measure as well as other measures commonly used to evaluate segmentation results. To show the potential of the framework, the implemented clustering methods are compared on simulated T1-weighted MR brain images from the Internet Brain Segmentation Repository (IBSR database) provided with ground truth segmentation.


Introduction
Medical image analysis plays a crucial role in modern diagnosis. Thus, computer-based image analysis is becoming an important field, with an increasing reliance on it by the biomedical community. Diagnostic imaging is an invaluable tool in medicine today. Magnetic Resonance Imaging (MRI), Computed Tomography (CT), digital mammography and other imaging modalities provide effective means for non-invasive analysis of the anatomy of a subject. Computer algorithms designed for the delineation of anatomical structures and other regions of interest are therefore key components in assisting or even automating specific medical tasks.
Image segmentation is an essential and crucial process for facilitating the delineation, characterization, and visualization of a Region Of Interest (ROI) in any medical image, because its output affects all subsequent processes of image analysis.
Medical images are difficult to segment, mainly due to the complexity of the anatomical features involved. The ROI may not be separable from its surroundings due to gray level inconsistency and the absence of strong edges along its border. The images typically contain noise, which may alter the intensity of a pixel such that its classification becomes uncertain. A single tissue class may show non-uniformity of intensity over the extent of the image. Uncertainty also occurs due to the observer variability at the expert level. Segmentation algorithms should be able to cope with these challenges.
Segmentation methods vary widely depending on the specific application, the imaging modality and other factors. Currently, there is no single segmentation method that yields acceptable results for every medical image.
Among medical imaging modalities, MRI is one of the safest methods for producing data with high spatial resolution, and it is also a low-risk, non-invasive modality in comparison to other diagnostic imaging techniques [1]. For this reason, the majority of research in medical image segmentation pertains to its use for MR images, and there are many methods available for MR image segmentation.
In particular, the segmentation of MR brain images has received significant attention in the field of biomedical image processing. It plays an important role in both clinical practice and neuroscience research, with applications in the field of biomedical analysis, such as identification of tumors, classification of tissues and blood cells, multi-modal registration [2], etc. Furthermore, the ability to segment and quantify brain tissues and anatomical structures has received increasing importance in the study of brain development [3,4], and its pathologies such as neurodegeneration [5,6], and dementia [7,8], as well as in the assessment of neurological [9,10] and psychiatric disorders [11,12].
A brain image mainly consists of three regions: Gray Matter (GM), White Matter (WM) and Cerebrospinal Fluid (CSF) [13]. Segmentation of a brain image aims to identify these three regions by exploiting the gray level distribution of pixels. Due to the complex structure of brain tissues in the brain images, manual segmentation is a difficult and time-consuming process. Therefore, there is a strong need for efficient computer-based systems that can accurately identify the boundaries of brain tissues with little interaction with the human expert.
Various automatic techniques have been proposed for segmentation of MR brain images. Their basic idea is to detect discontinuity among pixels of different regions or similarity among pixels of the same region. One category includes algorithms for detecting isolated points, lines or edges, like thresholding [14], edge-based detection [15] and region growing [16]. Thresholding techniques are effective when the histograms of the ROIs and background are clearly identifiable. However, for brain images, these techniques give inaccurate segmentation results since the distribution of pixels in the brain image is very complex. Edge-based methods rely heavily on detection of boundaries in the image. However, when applied to brain images, they often result in the wrong detection of boundaries due to the complex gray level distribution of GM, WM and CSF pixels.
Another category of segmentation methods includes algorithms of region growing, region splitting and merging. Region growing techniques use the homogeneity and connectivity criteria for segmentation; hence, a region growing algorithm merges pixels based on certain criteria. Region-based segmentation algorithms typically rely on the homogeneity of the image intensities in the regions of interest, and they often fail to provide accurate segmentation results due to intensity inhomogeneity. In [17], a region-based method for image segmentation, which is able to deal with intensity inhomogeneities in the segmentation, is proposed.
Other common segmentation methods are active contour models, also called snakes, used for delineating an object outline from a possibly noisy 2D image. In [18], an edge-based active contour model using the inflation/deflation force is proposed. The method allows active contour nodes to be moved to find object boundaries in a digital image. Experiments on MRI medical images show that the method is of major practical significance if the analyzed images contain weak boundaries and/or strong noise at the same time. In [19], an active contour model is proposed to segment images with intensity inhomogeneities. This method uses Gaussian kernel filtering to regularize the level set function after each iteration. Experiments show that the method achieves similar results to the local binary fitting energy method, but in a computationally more efficient way.
Among pixel-based approaches, clustering methods are the most widely used for image segmentation [20]. Clustering is the process of grouping a set of points (feature vectors) into subsets (called clusters) so that points in the same cluster are similar in some sense [21]. Several types of clustering methods have been discussed in the literature, such as expectation-maximization [1], K-means and fuzzy clustering techniques [22,23], which allow image pixels to belong to more than one class. Among fuzzy clustering techniques, Fuzzy C-Means (FCM) is the most widely-used technique [24]. It aims at minimizing an objective function according to some criteria. It permits one data point to belong to more than one cluster, as defined by a membership matrix.
In the last decade, clustering-based approaches received great interest in the domain of medical imaging. A large number of papers have been published concerning the application of clustering methods to the segmentation of medical images such as MR breast images [25,26], microscopic blood cell images [27,28] and X-ray images [29,30]. In particular, many works demonstrate the effectiveness of clustering for segmentation of MR brain images in order to detect GM, WM and CSF regions [31][32][33][34].
Despite the large number of clustering techniques proposed for medical image segmentation, a general framework for assessing the validity of the different segmentation methods is missing. Indeed, there are many works in the remote sensing field for segmentation quality assessment, including tools for tuning segmentation parameter values [35,36]. Proposed frameworks in the field of medical images enable only visual quality inspection of the segmentation results [37]. Only very few studies propose frameworks for quantitative evaluation of MR image segmentation [38], but no software tool has been made available. Hence, there is a need for open-source validation tools to assess the reliability of segmentation results obtained by different clustering methods applied to MR images.
To this end, in this work, we present a framework to aid clinicians in the identification of the best segmentation results for their specific task. The framework includes different clustering methods and some evaluation metrics to assess the quality of the results. Specifically, for each method, the framework enables easy definition of specific running parameters and enables comparison of the segmentation results. The framework includes specific pixel-based approaches for MR image segmentation that are based on clustering and enables comparison of segmentation results obtained by different clustering methods. Comparison is made not only qualitatively, through visual inspection of segmented images, but also quantitatively by means of the evaluation method proposed in [39]. This method is based on information theory, and it uses entropy as the basis for measuring the uniformity of pixel luminance within a segmentation region. The evaluation method provides a relative quality score that can be used within the framework to compare different segmentations of the same image.

The Image Segmentation Framework
Classically, image segmentation is defined as the partitioning of an image into non-overlapping regions, which are homogeneous with respect to some characteristic such as intensity or texture [40][41][42]. In general, image segmentation has a two-fold goal: (a) Partitioning: divide into regions/sequences with coherent internal properties; (b) Grouping: identify sets of coherent tokens in the image.
The goal of segmentation is also to simplify and/or change the representation of an image from a low-level (raw data) into a medium-level representation (image segmented into regions) that is more understandable and suitable for further analysis. An image is a collection of measurements in two-dimensional (2D) or three-dimensional (3D) space. In medical images, these measurements or image intensities can be radiation absorption in X-ray imaging, acoustic pressure in ultrasound or RF signal amplitude in MRI.
Given an image $I$, the segmentation problem is to determine the regions $R_j \subset I$ whose union is the entire image $I$, namely:
$$I = \bigcup_{j=1}^{K} R_j$$
where $R_i \cap R_j = \emptyset$ for $i \neq j$, and each $R_j$ is connected.
When the constraint that regions be connected is not considered, then the process of determining the regions $R_j$ is called pixel classification, and the regions $R_j$ are called classes. Pixel classification rather than classical segmentation is often a desirable goal in medical images, particularly when disconnected regions belonging to the same tissue class need to be identified. Determination of the total number of classes $K$ in pixel classification is a difficult problem [43], especially when prior knowledge about the anatomy depicted in the image is missing.
The framework presented in this work is intended to aid the clinician in the application of different clustering methods for image segmentation. It includes four state-of-the-art clustering algorithms, namely the K-means, the fuzzy C-means, the spatial fuzzy C-means and the kernelized FCM algorithm. The framework allows the user to set the parameters of each method and to compare the obtained segmentation results using some evaluation metrics.
The core of the tool is based on the ImageJ environment (imagej.nih.gov/ij/), an open source software package for image processing, extended with new functionalities. The main steps accomplished are the following:
• setting the running parameters of the specific method
• visualizing the segmented image
• computing the evaluation metrics
In more detail, the main components of the tool are the following:
1. User work section: This component gives the user the possibility to access the primary functions of the tool, which are the clustering method section and the evaluation section.
2. Method section: For each implemented method, the user can configure its parameters such as color space conversion, number of clusters, maximum number of iterations, stopping condition and visualization mode.
3. Evaluation section: The segmentation results can be evaluated both qualitatively and quantitatively. Qualitative evaluation is made by visualization of the segmented image compared to the ground truth image. Quantitative evaluation is made by computing the metrics described in Section 4.
The tool offers different visualization modes for displaying the segmented image, namely the regions can be labeled in different ways:
• by the cluster centroid color: each point of a cluster is labeled with the color of its centroid (in the case of color conversion, the color space is converted back to RGB);
• by a gray level: each pixel is labeled with the number of the cluster it belongs to, and the range is stretched to 0-255;
• by a random RGB color: a random RGB value is generated for each cluster;
• by a binary stack: the clustering is represented as a stack of binary images. Each binary image represents a cluster; each pixel shows a hard cluster membership. Thus, it is possible to extract cluster regions from the original image by performing an AND operation between a slice of the stack and the original image;
• by a fuzzy stack: a stack of gray level images is used to show the membership values of each pixel to each cluster. Each pixel represents the soft cluster membership value of that pixel in the original image according to the currently selected cluster.
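To illustrate the gray-level and binary-stack modes, the following Python sketch shows one possible way to derive them from a label image. It is not the code of the tool: the function names, the nested-list image representation and the linear stretch `255 * label / (K - 1)` are assumptions made only for this example.

```python
def gray_labels(labels, k):
    """Map cluster indices 0..k-1 to gray levels stretched over 0-255.

    Assumes k >= 2 and a linear stretch (the tool's exact mapping is not
    stated in the text).
    """
    return [[round(255 * lab / (k - 1)) for lab in row] for row in labels]


def binary_stack(labels, k):
    """Return k binary images (255 inside cluster j, 0 elsewhere)."""
    return [[[255 if lab == j else 0 for lab in row] for row in labels]
            for j in range(k)]


def extract_region(image, mask):
    """Pixel-wise AND between an 8-bit image and one slice of the stack."""
    return [[pix & m for pix, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


labels = [[0, 1], [1, 0]]
image = [[10, 20], [30, 40]]
stack = binary_stack(labels, k=2)
region0 = extract_region(image, stack[0])  # keeps only the pixels of cluster 0
```

Because each mask value is either 0 or 255, the AND leaves 8-bit pixel values inside the cluster unchanged and sets all other pixels to zero.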
Moreover, the tool enables selection of general parameters such as number of clusters, maximum number of iterations, stopping criterion and tolerance value used to stop the algorithm, initialization criterion for the centers and the membership matrix and randomized seed used to initialize a random number sequence.
In the following sections, the implemented clustering methods and the adopted evaluation metrics are briefly described.

K-means
The K-means algorithm [44] is the major example of partitional crisp clustering that assumes each data point to belong exactly to one cluster. It aims to partition N points into K partitions (clusters) in which each point belongs to the cluster having the nearest mean. Even though K-means was first proposed over 50 years ago, it is still by far the most used clustering algorithm for its simplicity of implementation and its effectiveness. When applied to image segmentation, the K-means algorithm clusters image pixels (features could be the color or the luminance of a pixel) by iteratively computing a mean intensity for each cluster and segmenting the image by associating each pixel to the cluster with the closest mean [21].
Let $\{x_1, x_2, \dots, x_N\}$ be the set of pixels and $\{c_1, c_2, \dots, c_K\}$ be the set of cluster means (centers). A partition of the image $I$ into $K$ clusters can be represented by mutually disjoint sets $C_1, \dots, C_K$ such that $C_1 \cup \dots \cup C_K = I$. The objective of the K-means algorithm is to minimize the distance among pixels inside the same cluster and to maximize the distance between clusters. This is obtained by minimizing the following objective function:
$$J = \sum_{k=1}^{K} \sum_{x_i \in C_k} d(x_i, c_k)^2$$
where $d(\cdot, \cdot)$ is the Euclidean distance.
The main steps of the K-means algorithm for image segmentation are:
1. Fix the number of clusters $K$, and initialize the cluster centers $c_k$ ($k = 1 \dots K$), either randomly or based on some heuristic;
2. Assign each pixel to the cluster that minimizes the distance between the pixel and the cluster center;
3. Re-compute the cluster centers by averaging all of the pixels in the cluster, namely:
$$c_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i$$
4. Repeat Steps 2 and 3 until convergence is attained (i.e., the assignment of pixels to clusters does not change).
One main issue of the K-means algorithm is that the clustering result depends strongly on the initialization of the cluster centers and on the number of clusters. Besides, K-means is a local search technique that usually converges to a local minimum, and it does not take the pixel distribution into consideration.
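The K-means steps listed above can be sketched in a few lines of Python for one-dimensional gray-level data. This is a minimal illustration, not the Java implementation of the framework: the function name `kmeans_gray` and the evenly spaced initial centers (a simple deterministic heuristic used here in place of a random initialization) are assumptions.

```python
def kmeans_gray(pixels, k, max_iter=100):
    """1-D K-means on a flat list of gray levels (assumes k >= 2)."""
    lo, hi = min(pixels), max(pixels)
    # Step 1: initialize centers, here evenly spaced over the gray range
    # (an assumed heuristic; the text also allows random initialization).
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each pixel to the nearest center.
        new_labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2)
                      for x in pixels]
        # Step 4: stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its pixels.
        for j in range(k):
            members = [x for x, lab in zip(pixels, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = sum(members) / len(members)
    return centers, labels


pixels = [10, 12, 11, 200, 205, 198, 90, 95]
centers, labels = kmeans_gray(pixels, k=3)
# centers -> [11.0, 92.5, 201.0]; labels -> [0, 0, 0, 2, 2, 2, 1, 1]
```

A different initialization can lead to a different, possibly worse, local minimum, which is the sensitivity to initialization mentioned in the text.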

Fuzzy C-Means
Unlike crisp clustering methods, which force pixels to belong exclusively to one cluster, fuzzy clustering methods allow pixels to belong to multiple clusters with varying degrees of membership, thus enabling vague or fuzzy borders between different clusters. There has been considerable interest in the past few years in the use of fuzzy segmentation methods, which retain more information from the original image than crisp segmentation methods [45][46][47]. The main example of a fuzzy clustering algorithm is the fuzzy version of K-means called Fuzzy C-Means (FCM) [24].
FCM is a partition clustering method based on the minimization of the following objective function:
$$J_m = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, d(x_i, c_k)^2 \tag{3}$$
where $d(x_i, c_k)$ is the distance between the point $x_i$ and the cluster center $c_k$, $m$ is the fuzziness parameter, $K$ is the number of clusters, $N$ is the number of data points $x_i$ and $u_{ik} \in [0, 1]$ is the membership degree of $x_i$ belonging to the cluster $k$, calculated as follows:
$$u_{ik} = \frac{1}{\sum_{l=1}^{K} \left( \frac{d(x_i, c_k)}{d(x_i, c_l)} \right)^{2/(m-1)}} \tag{4}$$
for $i = 1 \dots N$, $k = 1 \dots K$. Using the fuzzy membership matrix $U = [u_{ik}]$, a new position of the $k$-th centroid is calculated as:
$$c_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} x_i}{\sum_{i=1}^{N} u_{ik}^{m}} \tag{5}$$
with the constraint $\sum_{k} u_{ik} = 1$. The fuzziness parameter $m$, with $1 \le m < \infty$, is a scalar weighting exponent that controls the fuzziness degree of the clustering process. The larger its value, the fuzzier the partition.
If this parameter has value 1.0, FCM reduces to the crisp K-means algorithm, the membership values being only zero or one (the exponent in Equation (4) is undefined at $m = 1$, so this case corresponds to the limit $m \to 1$). When $m$ approaches infinity, the mass center of the dataset is the only solution of FCM. The most common choice for $m$ is 2.0, which has been proven to be suitable also for MR brain image segmentation [48]. Minimization of (3) is obtained by iteratively computing membership values according to Equation (4) and cluster centers as in Equation (5). The main steps of the FCM algorithm are the following:
1. Fix the number of clusters $K$ and initialize the cluster centers $c_k$ ($k = 1 \dots K$), either randomly or based on some heuristic;
2. Compute the membership values $u_{ik}$ using Equation (4);
3. Re-compute the cluster centers $c_k$ using Equation (5);
4. Repeat Steps 2 and 3 until convergence is attained (i.e., the assignment of pixels to clusters does not change).
The parameters required to run the FCM algorithm are the number of clusters $K$ and the fuzziness parameter $m$. Our framework enables the selection of these parameters.
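As a hedged illustration of the FCM iteration, the sketch below alternates the standard membership update and the weighted center update on one-dimensional gray levels. The function name `fcm_gray`, the evenly spaced initial centers and the stopping rule based on the largest center shift (`tol`) are assumptions of this example and are not taken from the framework.

```python
def fcm_gray(pixels, k, m=2.0, max_iter=100, tol=1e-5):
    """1-D fuzzy C-means sketch (assumes k >= 2 and m > 1)."""
    lo, hi = min(pixels), max(pixels)
    # Step 1: evenly spaced initial centers (assumed heuristic).
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    u = []
    for _ in range(max_iter):
        # Step 2: membership of every pixel in every cluster.
        u = []
        for x in pixels:
            d = [abs(x - c) for c in centers]
            if 0.0 in d:
                # Pixel coincides with a center: full membership there.
                row = [1.0 if dj == 0.0 else 0.0 for dj in d]
                s = sum(row)
                row = [r / s for r in row]
            else:
                row = [1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1.0))
                                 for l in range(k))
                       for j in range(k)]
            u.append(row)
        # Step 3: centers as membership-weighted means.
        new_centers = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(len(pixels))]
            new_centers.append(sum(wi * x for wi, x in zip(w, pixels)) / sum(w))
        # Step 4: stop when the centers barely move (assumed criterion).
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


pixels = [10, 12, 11, 200, 205, 198, 90, 95]
centers, u = fcm_gray(pixels, k=3)
# A crisp segmentation is obtained by taking the largest membership per pixel.
hard = [max(range(3), key=lambda j: row[j]) for row in u]
```

Each row of `u` sums to one, so the soft memberships can be displayed directly or hardened as shown in the last line.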
The FCM algorithm has been used widely for the segmentation of MR images [49][50][51]. The FCM method, however, does not address the spatial intensity inhomogeneity artifact induced by the radio-frequency coil in MR images [52,53]. To deal with the inhomogeneity problem, many algorithms have been proposed by adding correction steps before segmenting the image [54,55] or by modeling the image as the product of the original image and a smoothly varying multiplier field [1,45]. More recently, spatial information has been embedded into the original FCM algorithm to better segment the images [56,57]. The framework presented in this paper includes one example of spatial FCM that is briefly described in the following section.

Spatial FCM
From Equation (3), it can be observed that FCM does not incorporate any spatial dependencies between pixels. This may degrade the overall segmentation result, because neighboring pixels may be highly correlated, and thus, they should belong to the same cluster. When applied to image segmentation, clustering should take into account the spatial information of pixels. To this aim, several spatial variants of the FCM have been proposed. Among these, we consider the Spatial FCM (SFCM) proposed in [58] and applied in [59] to biomedical image segmentation. The SFCM uses a spatial function defined as:
$$h_{ij} = \sum_{x_n \in NB(x_i)} u_{nj} \tag{6}$$
where $NB(x_i)$ represents the neighborhood of the pixel $x_i$ in the spatial domain. Just like the membership function, the spatial function $h_{ij}$ represents the membership degree of pixel $x_i$ belonging to the $j$-th cluster. The spatial function of a pixel for a cluster is large if the majority of its neighbors belong to the same cluster. The spatial function modifies the membership function of a pixel according to the membership statistics of its neighbors as follows:
$$u'_{ij} = \frac{u_{ij}^{p} \, h_{ij}^{q}}{\sum_{l=1}^{K} u_{il}^{p} \, h_{il}^{q}} \tag{7}$$
where $p$ and $q$ are parameters to control the relative importance of both functions. The iterative scheme of the SFCM is the same as in FCM, but Step 2 includes three sub-steps. The first one is the same as in standard FCM, i.e., to calculate the membership values using Equation (4). In the second sub-step, the membership information of each pixel is mapped to the spatial domain, and the spatial function is computed from that using Equation (6). Then, the new membership values are computed according to (7). The SFCM iteration proceeds by updating cluster centers according to (5) as in FCM. Specific parameters that can be set for this algorithm are:
• $m \ge 1$: fuzziness parameter used to control the fuzziness; if $m$ is near one, the results are similar to those obtained by K-means
• $p$ and $q$: parameters used to control the relative importance of membership and spatial functions
• Radius $r$: the spatial function is evaluated on a $(2r + 1) \times (2r + 1)$ window centered on the pixel
Figure 1 shows the interface of our framework that enables the user to define such parameters.
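The effect of the spatial function can be seen on a small example. The Python sketch below sums the memberships over a $(2r + 1) \times (2r + 1)$ window and then reweights each membership by $u^p h^q$, normalized over the clusters, as described above. The function name `spatial_update`, the `u[j][y][x]` layout and the clipping of the window at the image border are assumptions of this sketch; the border handling of the actual tool is not stated in the text.

```python
def spatial_update(u, p=1.0, q=1.0, r=1):
    """u[j][y][x] is the membership of pixel (y, x) in cluster j.

    Returns the spatially reweighted memberships. The window is clipped
    at the image border (an assumption of this sketch).
    """
    k, rows, cols = len(u), len(u[0]), len(u[0][0])
    # Spatial function: sum of memberships in the window around each pixel.
    h = [[[sum(u[j][yy][xx]
               for yy in range(max(0, y - r), min(rows, y + r + 1))
               for xx in range(max(0, x - r), min(cols, x + r + 1)))
           for x in range(cols)]
          for y in range(rows)]
         for j in range(k)]
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(k)]
    for y in range(rows):
        for x in range(cols):
            # Reweight by u^p * h^q and normalize over the clusters.
            w = [u[j][y][x] ** p * h[j][y][x] ** q for j in range(k)]
            total = sum(w)
            for j in range(k):
                out[j][y][x] = w[j] / total
    return out


# Two clusters on a 3 x 3 image; the center pixel is a noisy outlier that
# slightly prefers cluster 1 while all of its neighbors prefer cluster 0.
u = [[[0.9] * 3 for _ in range(3)], [[0.1] * 3 for _ in range(3)]]
u[0][1][1], u[1][1][1] = 0.4, 0.6
out = spatial_update(u, p=1.0, q=1.0, r=1)
# After the update the center pixel prefers cluster 0, like its neighbors.
```

In this toy case the neighborhood evidence outweighs the pixel's own membership, which is the behavior that makes the spatial variant more robust to isolated noisy pixels.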

Kernelized FCM
In recent years, a number of powerful kernel-based learning methods have been proposed [60] that construct a nonlinear version of a linear algorithm using the so-called "kernel trick" or kernel substitution. This consists of using an (implicit) nonlinear map from the data space to the mapped feature space, $\Phi : X \to F$ ($x \mapsto \Phi(x)$), so that an input data space $X$ with low dimension is mapped into a potentially much higher dimensional feature space $F$ in order to turn the original nonlinear problem in the input space into a potentially linear one in a rather high dimensional feature space. A kernel in the feature space can be represented as a kernel function $K$ defined as:
$$K(x, y) = \langle \Phi(x), \Phi(y) \rangle$$
where $\langle \cdot, \cdot \rangle$ denotes the inner product operation. There are different commonly-used kernel functions in the literature, such as the Gaussian Radial Basis Function (GRBF) kernel, polynomial kernel and sigmoid kernel [60].
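The kernel method has also been applied to clustering. In particular, in [61], a Kernelized version of fuzzy C-means (KFCM) is proposed and applied to the segmentation of MR images. It is realized by replacing the original Euclidean distance in the FCM algorithm with a kernel-induced distance. The KFCM minimizes the following objective function:
$$J_K = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, \| \Phi(x_i) - \Phi(c_k) \|^2 \tag{8}$$
Using the GRBF kernel, we have $K(x, x) = 1$; hence, we obtain the following simplified expression:
$$\| \Phi(x_i) - \Phi(c_k) \|^2 = K(x_i, x_i) + K(c_k, c_k) - 2 K(x_i, c_k) = 2 \, (1 - K(x_i, c_k))$$
and the objective function (8) can be rewritten as:
$$J_K = 2 \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, (1 - K(x_i, c_k))$$
Similarly to the standard FCM algorithm, the objective function $J_K$ can be iteratively minimized by using the following update formulas:
$$u_{ik} = \frac{(1 - K(x_i, c_k))^{-1/(m-1)}}{\sum_{l=1}^{K} (1 - K(x_i, c_l))^{-1/(m-1)}} \tag{10}$$
and:
$$c_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} \, K(x_i, c_k) \, x_i}{\sum_{i=1}^{N} u_{ik}^{m} \, K(x_i, c_k)} \tag{11}$$
The KFCM algorithm follows the same iterative scheme as FCM.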
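A single KFCM iteration on gray levels can be sketched as follows, with $1 - K(x, c)$ playing the role of the squared distance in the membership update and the kernel values acting as additional weights in the center update. The Gaussian kernel form $\exp(-(x - y)^2 / \sigma^2)$, the value of `sigma` and the function names are assumptions of this example; the text does not specify the kernel width used in the framework.

```python
import math


def gaussian_kernel(x, y, sigma=150.0):
    """Assumed GRBF kernel; note that gaussian_kernel(x, x) == 1."""
    return math.exp(-((x - y) ** 2) / sigma ** 2)


def kfcm_step(pixels, centers, m=2.0, sigma=150.0):
    """One KFCM iteration: membership update, then center update (m > 1)."""
    u = []
    for x in pixels:
        # Kernel-induced distance up to a constant factor: 1 - K(x, c).
        d = [1.0 - gaussian_kernel(x, c, sigma) for c in centers]
        if min(d) == 0.0:
            # Pixel coincides with a center: full membership there.
            row = [1.0 if dj == 0.0 else 0.0 for dj in d]
        else:
            row = [dj ** (-1.0 / (m - 1.0)) for dj in d]
        s = sum(row)
        u.append([r / s for r in row])
    new_centers = []
    for j, c in enumerate(centers):
        # Weights combine the fuzzy membership and the kernel value.
        w = [u[i][j] ** m * gaussian_kernel(x, c, sigma)
             for i, x in enumerate(pixels)]
        new_centers.append(sum(wi * x for wi, x in zip(w, pixels)) / sum(w))
    return u, new_centers


pixels = [10, 12, 11, 200, 205, 198]
centers = [0.0, 255.0]
for _ in range(20):
    u, centers = kfcm_step(pixels, centers)
# centers end up close to the two group means (about 11 and 201)
```

Because the kernel value decays with distance, pixels far from a center contribute very little to its update, which tends to make the centers less sensitive to outliers than in standard FCM.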

Evaluation Metrics
The validation process is an important step to define the reliability and reproducibility of a given brain MRI segmentation method. Often, the results of segmentation methods are evaluated only visually and qualitatively. Such an approach is either subjective or tied to particular applications. Conversely, it is desirable to judge the performance of a segmentation method objectively; hence, qualitative evaluation cannot be used as a means to compare the performance of different segmentation techniques.
Typically for quantitative validation purposes, the segmented brain images are compared to the corresponding ground truth, and an evaluation metric is calculated. For instance, the number of pixels that have been correctly segmented is divided by the total number of pixels in the brain. Common measures used for quantitative evaluation of segmentation methods are accuracy, sensitivity and specificity. The accuracy of the segmentation method is computed as the rate of correctly classified pixels over all pixels. Given a tissue $T$ (GM, WM, CSF), it is defined as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where $TP$, $TN$, $FP$ and $FN$ denote the numbers of true positive, true negative, false positive and false negative pixels for tissue $T$, respectively. Sensitivity refers to the ability of a clustering method to accurately identify the tissue regions in the segmented image. It is defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
The specificity reflects the ability of the clustering method to accurately identify the non-tissue regions. It is defined as:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
Other measures usually considered to evaluate segmentation are the Dice Similarity Coefficient (DSC) and the Jaccard Similarity (JS) value. DSC is a statistical validation metric that was proposed in [62] to evaluate the accuracy of segmentation methods. The DSC measure describes the overlap between the segmented and ground truth images using the following formula:
$$DSC = \frac{2 \, |G \cap S|}{|G| + |S|}$$
The JS metric is defined as:
$$JS = \frac{|G \cap S|}{|G \cup S|}$$
where $G$ is the ground-truth image and $S$ is the segmented image. High values of JS indicate that the segmented regions match the ground truth regions well. The above metrics do not always express completely the quality of a segmentation method. A good segmentation evaluation should maximize the uniformity of pixels within each segmented region and minimize the uniformity across the regions.
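As a small worked example, the following Python sketch computes the five overlap measures for one tissue from a segmented mask and a ground-truth mask. The function name `overlap_metrics` and the flat 0/1 list representation are assumptions of this example; the counts TP, TN, FP and FN are the usual numbers of true positive, true negative, false positive and false negative pixels, and DSC and JS are written in their equivalent count form.

```python
def overlap_metrics(seg, gt):
    """seg, gt: flat lists of 0/1 flags marking the pixels of one tissue."""
    tp = sum(1 for s, g in zip(seg, gt) if s and g)
    tn = sum(1 for s, g in zip(seg, gt) if not s and not g)
    fp = sum(1 for s, g in zip(seg, gt) if s and not g)
    fn = sum(1 for s, g in zip(seg, gt) if not s and g)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # |G ∩ S| = TP, |G| + |S| = 2 TP + FP + FN, |G ∪ S| = TP + FP + FN
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
    }


gt = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # ground truth: 4 tissue pixels
seg = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # segmentation: 3 hits, 1 miss, 1 false alarm
scores = overlap_metrics(seg, gt)
# accuracy 0.8, sensitivity 0.75, specificity 5/6, dice 0.75, jaccard 0.6
```

The sketch does not guard against empty masks; a tissue that is absent from both images would need special handling to avoid a division by zero.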
Consequently, a natural characteristic to incorporate into a segmentation evaluation metric is a measure of the disorder within a region. Along with this idea, in [39], the concept of entropy is used to measure the disorder within regions of a segmented image. Given a segmented image $I = \bigcup_{j=1}^{N} R_j$, where $R_j$ is a region of the image, we indicate by $v$ a specific feature used to describe the pixels in region $R_j$ and define $V_j^{(v)}$ as the set of all possible values associated with feature $v$ in region $R_j$ and $L_j(m)$ as the number of pixels that have a value of $m$ for feature $v$. The entropy for region $R_j$ is defined as:
$$H_v(R_j) = -\sum_{m \in V_j^{(v)}} \frac{L_j(m)}{S_j} \log \frac{L_j(m)}{S_j}$$
where $S_j = |R_j|$ denotes the area of region $R_j$.
Here, we consider $v$ to be the luminance of pixels and simplify $H_v(R_j)$ to $H(R_j)$. Hence, from an information coding theory point of view, the quantity $L_j(m) / S_j$ represents the probability that a pixel in region $R_j$ has a luminance value of $m$. Thus, $H(R_j)$ is the number of bits per pixel needed to encode the luminance for region $R_j$, given that the region $R_j$ is known. Finally, we define the expected region entropy of image $I$ as the expected entropy across all regions where each region has weight (or probability) proportional to its area. That is, the expected region entropy of segmented image $I$ is:
$$H_r(I) = \sum_{j=1}^{N} \frac{S_j}{S_I} H(R_j)$$
where $S_I$ is the area (as measured by the number of pixels) of the entire image. When each region has very uniform luminance, then $H_r(I)$ will be small. Besides, when all pixels in a region have the same value, then the entropy for the region will be zero. Since an over-segmented image will have a very small value of $H_r(I)$, this value should be combined with another term that penalizes segmentations having a large number of regions. In order to fully encode the information in a segmented image, we should not only encode the luminance value of a pixel within a region (i.e., the region entropy), but also a representation for the segmentation itself, i.e., we should specify the region for each pixel.
In [39], the authors introduce the layout entropy as a measure to encode the region for each pixel, defined as:
$$H_l(I) = -\sum_{j=1}^{N} \frac{S_j}{S_I} \log \frac{S_j}{S_I}$$
Using a coding theory framework, one can view $p_j = S_j / S_I$ as the probability that a pixel in the image belongs to region $j$ under a probabilistic assumption that each pixel is independently selected to be in region $j$ with probability $p_j$. Hence, $H_l(I)$ represents the number of bits for specifying a region for each pixel.
While the expected region entropy $H_r(I)$ provides an estimate of the average disorder within regions in a segmented image and it generally decreases with the number of regions, the layout entropy increases with the number of regions. Hence, the two factors $H_r(I)$ and $H_l(I)$ can be used to counteract the effects of over-segmenting or under-segmenting when evaluating the effectiveness of a given segmentation. By additively combining both the layout entropy and the expected region entropy, in [39], the entropy-based evaluation function $E$ is introduced to measure the effectiveness of a segmentation method. It is defined as:
$$E = H_l(I) + H_r(I)$$
When the image is maximally segmented (with one pixel per region), $E$ is not minimized, since in such a case, the layout entropy becomes very large. Furthermore, when the image is under-segmented (with very few regions), $E$ is not minimized since the expected region entropy will be high. As desired, the measure $E$ balances these two factors.
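The entropy-based measure can be computed directly from a luminance image and a region label image. The sketch below is a minimal Python illustration under stated assumptions: the function name `entropy_measure` is hypothetical, images are given as flat lists, and logarithms are taken in base 2, which matches the interpretation in bits but is not stated explicitly as the base used in the framework.

```python
import math
from collections import Counter


def entropy_measure(luminance, labels):
    """Return (H_r, H_l, E) for flat lists of gray levels and region labels.

    Base-2 logarithms are an assumption of this sketch.
    """
    total = len(luminance)
    regions = {}
    for v, lab in zip(luminance, labels):
        regions.setdefault(lab, []).append(v)
    h_region = 0.0  # expected region entropy H_r(I)
    h_layout = 0.0  # layout entropy H_l(I)
    for values in regions.values():
        s_j = len(values)      # area of the region
        p_j = s_j / total      # probability that a pixel falls in the region
        # Entropy of the luminance histogram inside the region.
        h_rj = -sum((n / s_j) * math.log2(n / s_j)
                    for n in Counter(values).values())
        h_region += p_j * h_rj
        h_layout -= p_j * math.log2(p_j)
    return h_region, h_layout, h_region + h_layout


lum = [10, 10, 10, 10, 200, 200, 200, 200]
good = entropy_measure(lum, [0, 0, 0, 0, 1, 1, 1, 1])  # two uniform regions
over = entropy_measure(lum, list(range(8)))            # one region per pixel
# good -> H_r = 0, H_l = 1, E = 1;  over -> H_r = 0, H_l = 3, E = 3
```

Both segmentations have zero region entropy, but the over-segmented one pays a higher layout cost, so the two-region segmentation obtains the lower (better) value of E.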

Application and Results
In this section, we present some comparative results of MR image segmentation using the clustering methods implemented in the proposed framework. A preliminary version of the tool is available at [63]. The framework is developed in Java as a plugin for ImageJ (http://imagej.nih.gov), a public-domain Java image processing program inspired by NIH Image for Macintosh. The functions provided by ImageJ built-in commands can be extended by user-written code in the form of macros and plugins. Figure 2 shows an excerpt of code of our framework that summarizes in a macro the procedure for clustering and computation of the evaluation metrics. We used T1-weighted MR brain images from the Internet Brain Segmentation Repository (IBSR), which was made available by the Center for Morphometric Analysis, Massachusetts General Hospital [64]. The IBSR dataset contains a three-dimensional T1-weighted MRI brain dataset obtained from 20 normal subjects and the associated manual segmentation into three classes corresponding to different tissues, namely: GM, WM and CSF. We used the manual segmentation as the ground truth to evaluate our results. The ground truth given in IBSR is the result of manual segmentation performed by trained experts. According to the IBSR documentation, the experts used a semi-automated intensity contour mapping algorithm [65] and also signal intensity histograms. Once the external border was determined by intensity contour mapping, gray-white matter borders were demarcated using signal intensity histograms. Using this technique, borders are defined as the midpoint between the peaks of the bimodal histogram for a given structure and its adjacent tissue. Other neuroanatomical structures were segmented similarly.
For each subject, we extracted three slices; hence, we considered a total of 60 images for segmentation. We used 80% of the images (48 images) to assess the parameters of each clustering method and the remaining 20% (12 images) for the final test using the best parameter setting found for each method. Figure 3 shows two examples of test images used in the experiments and their corresponding ground truth segmentation.
The gray values of the pixels in the brain image were taken as the basis for clustering. Each clustering algorithm was executed with different parameter settings. Table 1 lists the parameter values considered for each algorithm. The number of clusters K was varied from 4 to 10. For the fuzziness parameter m, we considered the values 1.0, 1.5 and 2.0. For the SFCM algorithm, we considered all combinations of the values p = 1.0, 2.0; q = 1.0, 2.0; and r = 2.0, 3.0, 4.0. For the KFCM algorithm, we considered window sizes (Ws) of 1, 3 and 5, together with three filtering options (opt): average, median and weighted.
To assess and compare the segmentation results obtained by the considered clustering methods, we used the entropy-based evaluation measure E described in Section 4. The measure was evaluated for each clustering algorithm while varying the number of clusters and the other parameters. Figure 4 plots the values of E averaged over all 48 images for all the considered algorithms with different combinations of parameters. In each case, the optimal number of clusters is K = 4. This is compatible with the task of segmenting the brain image into three tissue regions (GM, WM, CSF) plus the background region. Table 2 summarizes the average values of the E measure, with the standard deviation, obtained on the test images. To better compare the methods, we applied a z-test to test the hypothesis that one method outperforms the others. The null hypothesis can be rejected for SFCM with the smallest p-value; hence, we conclude that SFCM outperforms the other methods.
Figure 5 plots the best values of E for the segmentation of the two images shown in Figure 3 using the optimal parameter setting. For all algorithms, the optimal configuration includes K = 4 and m = 2.0. For SFCM, the optimal values of the remaining parameters are p = 1.0, q = 2.0 and r = 3.0. Finally, KFCM provides better results with window size Ws = 1 and opt = median. It is interesting to note that the same optimal configuration was found for most of the images.
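The parameter sweep described above can be enumerated with a few lines of Python. This is only a minimal sketch, not the framework's actual code: the names `GRIDS`, `configurations` and `best_configuration` are hypothetical, the fuzziness parameter m is assumed to apply to FCM, SFCM and KFCM but not to K-means, and a lower value of E is assumed to indicate a better segmentation.

```python
from itertools import product

K_VALUES = range(4, 11)      # number of clusters K, from 4 to 10
M_VALUES = (1.0, 1.5, 2.0)   # fuzziness parameter m

# Hypothetical layout of the parameter grids reported in the text.
# Assumption: m is used by the three fuzzy algorithms only.
GRIDS = {
    "K-means": {"K": K_VALUES},
    "FCM": {"K": K_VALUES, "m": M_VALUES},
    "SFCM": {"K": K_VALUES, "m": M_VALUES,
             "p": (1.0, 2.0), "q": (1.0, 2.0), "r": (2.0, 3.0, 4.0)},
    "KFCM": {"K": K_VALUES, "m": M_VALUES,
             "Ws": (1, 3, 5), "opt": ("average", "median", "weighted")},
}


def configurations(grid):
    """Yield every combination of parameter values in `grid` as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def best_configuration(grid, score):
    """Return the configuration with the lowest score.

    `score` is a hypothetical callable that maps a configuration to the
    average E value; lower is assumed to be better.
    """
    return min(configurations(grid), key=score)


if __name__ == "__main__":
    for name, grid in GRIDS.items():
        print(name, sum(1 for _ in configurations(grid)))
```

Under these assumptions the sweep covers 7 settings for K-means, 21 for FCM, 252 for SFCM and 189 for KFCM, which gives a sense of how many segmentations are compared per image.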
Besides the E measure, we considered the other evaluation measures described in Section 4 to assess and compare the results. In Table 3, we compare the performance of the considered algorithms in terms of these measures, averaged over all the classes. For each algorithm, the optimal parameter setting is considered. We found that SFCM also provides better results in terms of these evaluation metrics.
Figures 6 and 7 show the MR brain images segmented using K-means, FCM, spatial FCM (SFCM) and kernelized FCM (KFCM). From the values of the performance measures, as well as from the segmented images, we can conclude that SFCM segments the brain images better than the other methods. Moreover, it is interesting to note that, although the segmented images look quite similar to the original images, the value of the entropy-based measure computed on the original images is about E = 1.27, hence much higher than the values obtained on the segmented images.
Finally, we tested the sensitivity of the algorithms in the presence of noise. To this aim, all test images were corrupted by 10% salt-and-pepper noise, and then a simple median filter was applied to reduce the noise before clustering. The median filter is a non-linear filter that preserves edges while removing impulsive noise (outliers). It replaces the center pixel of each m × m neighborhood window with the median of the pixels in that window. In this work, a 3 × 3 window is used. Figure 8 shows two test images corrupted by noise. The segmentation results obtained on these two noisy images are shown qualitatively in Figures 9 and 10. Quantitative results are shown in Table 4, where a comparison with the segmentation results on the original images is also given. Even in the presence of noise, the SFCM algorithm achieves better clustering results.
A final step in medical image segmentation is labeling. Labeling is the process of assigning a meaningful designation to each region obtained by clustering, and it can be performed separately from segmentation. It maps the numerical index j of region R_j to an anatomical designation. In our medical application, the labels (GM, WM, CSF) can be assigned upon inspection by a physician or technician. Hence, the clustered images should be further analyzed for a final classification of all regions.
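The noise test described earlier in this section (salt-and-pepper corruption followed by a 3 × 3 median filter) can be sketched as follows. This is an illustrative pure-Python version, not the framework's implementation: the function names are hypothetical, images are assumed to be lists of rows of gray values in the range 0 to 255, and border pixels are handled by replicating the nearest edge pixel, which the text does not specify.

```python
import random
from statistics import median


def add_salt_and_pepper(image, fraction, rng, low=0, high=255):
    """Return a copy of `image` with `fraction` of its pixels set to low or high."""
    rows, cols = len(image), len(image[0])
    noisy = [row[:] for row in image]
    n_corrupt = round(fraction * rows * cols)
    # Pick distinct pixel positions and overwrite each with pepper or salt.
    for pos in rng.sample(range(rows * cols), n_corrupt):
        r, c = divmod(pos, cols)
        noisy[r][c] = rng.choice((low, high))
    return noisy


def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighborhood.

    Assumption: indices outside the image are clamped to the nearest
    valid row or column (edge replication).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    filtered = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            filtered[r][c] = median(window)
    return filtered


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    clean = [[100] * 10 for _ in range(10)]
    noisy = add_salt_and_pepper(clean, 0.10, rng)
    restored = median_filter(noisy, size=3)
    print(sum(v != 100 for row in restored for v in row), "pixels still differ")
```

An isolated outlier is always removed by the 3 × 3 median, because it is only one of nine values in every window that contains it; clusters of corrupted pixels may survive, which is why the filter reduces rather than eliminates the noise.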

Conclusions
The segmentation of brain MR images is an important but challenging step in medical image analysis, in both clinical and research settings. Various techniques, such as threshold-based, clustering-based or hybrid methods, have been developed and optimized to perform brain image segmentation. However, not all techniques achieve high accuracy. In this paper, we have proposed a framework for image segmentation that includes several clustering algorithms. Using the framework, we showed that clustering algorithms are effective methods for brain image segmentation. In particular, clustering that incorporates spatial information segments the MR images very well. The developed framework offers both qualitative and quantitative evaluation of the segmentation results; hence, it provides valid support for the analysis of brain images. Future work will extend the framework to allow various clustering methods to be combined in order to achieve more robust segmentation results.

Figure 1. Graphical interface for the selection of Spatial FCM (SFCM) parameters.

Figure 2. An example of a macro in the developed framework.

Figure 4. Average trend of the entropy-based measure. The dotted line refers to the average value computed on the ground truth segmented images.

Figure 5. Entropy-based evaluation measure for the clustering algorithms on Image 202-3 and on Image 205-3 using the optimal parameter setting.

Figure 8. The 202-3 (a) and 205-3 (b) images with added noise and the pre-processed images using the median filter (c,d).
• TP (True Positive) is the number of pixels that belong to tissue T in the ground truth image and are correctly classified as tissue T in the segmented image;
• FN (False Negative) is the number of pixels that are classified as tissue T in the ground truth image, but classified as different tissues in the segmented image;
• TN (True Negative) is the number of pixels that are classified as different tissues both in the segmented and ground truth images;
• FP (False Positive) is the number of pixels incorrectly classified as tissue T in the segmented image compared to the ground truth image.
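The four counts defined above can be computed for one tissue T by comparing the two label images pixel by pixel. The sketch below is only an illustration: the function name `confusion_counts` is hypothetical, both images are assumed to be lists of rows with the same shape, and the clusters of the segmented image are assumed to have already been mapped to tissue labels.

```python
def confusion_counts(ground_truth, segmented, tissue):
    """Count TP, FN, TN and FP for `tissue` over two label images.

    Assumption: both images have the same shape and use the same label
    values (for example "GM", "WM", "CSF", "BG").
    """
    tp = fn = tn = fp = 0
    for gt_row, seg_row in zip(ground_truth, segmented):
        for gt, seg in zip(gt_row, seg_row):
            if gt == tissue and seg == tissue:
                tp += 1   # tissue T in both images
            elif gt == tissue:
                fn += 1   # tissue T in the ground truth, missed in the segmentation
            elif seg == tissue:
                fp += 1   # labeled T in the segmentation only
            else:
                tn += 1   # a different tissue in both images
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}


if __name__ == "__main__":
    truth = [["GM", "GM", "WM"],
             ["CSF", "WM", "WM"]]
    result = [["GM", "WM", "WM"],
              ["CSF", "GM", "WM"]]
    print(confusion_counts(truth, result, "GM"))
```

The four counts always sum to the number of pixels, which gives a quick sanity check before deriving any ratio-based measure from them.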

Table 4. Average segmentation results obtained by different clustering algorithms on the test images.