A Fuzzy Consensus Clustering Algorithm for MRI Brain Tissue Segmentation

Abstract: Brain tissue segmentation is an important component of the clinical diagnosis of brain diseases using multi-modal magnetic resonance imaging (MRI). Many unsupervised methods for brain tissue segmentation have been proposed in the literature. The most commonly used are K-Means, Expectation-Maximization, and Fuzzy Clustering. Fuzzy clustering methods offer considerable benefits compared with the other two, as they are capable of handling brain images that are complex, largely uncertain, and imprecise. However, this approach suffers from the intrinsic noise and intensity inhomogeneity (IIH) in the data resulting from the acquisition process. To resolve these issues, we propose a fuzzy consensus clustering algorithm that defines a membership function resulting from a voting schema to cluster the pixels. In particular, we first pre-process the MRI data and employ several segmentation techniques based on traditional fuzzy sets and intuitionistic sets. Then, we adopt a voting schema to fuse the results of the applied clustering methods. Finally, to evaluate the proposed method, we use well-known performance measures (boundary measure, overlap measure, and volume measure) on two publicly available datasets (OASIS and IBSR18). The experimental results show the superior performance of the proposed method in comparison with the recent state of the art. The proposed method is also applied to a real-world Autism Spectrum Disorder detection problem, where it achieves better accuracy than other existing methods.


Introduction
Segmenting brain tissue is the process of subdividing the image of the brain into major components such as Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM). The step of brain tissue segmentation is fundamental in diagnosing and monitoring a wide range of neurological diseases. Several researchers have strived to develop automatic brain tissue segmentation in the last two decades [1][2][3][4].
Brain tissue segmentation has been developed by many unsupervised methods in the literature. The most commonly used unsupervised methods are: K-Means [5][6][7], Expectation-Maximization [8], and Fuzzy Clustering [9,10]. Fuzzy clustering methods offer considerable benefits compared with the aforementioned methods as they are capable of handling brain images that are complex, largely uncertain, and imprecise.
Even though traditional Fuzzy C-Means (FCM) achieves outstanding results on brain image segmentation, it has some limitations: it is sensitive to noise because it uses the Euclidean distance metric, and it ignores neighbourhood information. FCM computes the distance between each cluster center and each voxel using the Euclidean distance, which is very sensitive to noise and therefore degrades the segmentation results. Many FCM variants have been developed in the literature to address these shortcomings. To address the noise sensitivity, researchers added spatial information to the FCM objective function [11,12]. Adding a spatial term to the objective function reduces the impact of noise and improves performance. The spatial information may be local or global [13]. To address the limitations of the Euclidean distance, many researchers developed a kernel version of FCM, named Kernel FCM (KFCM) [14,15]. KFCM adopts a kernel function as the distance measure. The kernel function maps the input data to a higher-dimensional kernel space, which makes the clustering task easier. The aforementioned FCM variants are based on the traditional fuzzy set. In a fuzzy set, the non-membership value is always the complement of the membership value. In practice, however, this assumption fails due to hesitation, which arises from uncertainty in defining the membership function. To handle this hesitation, Atanassov [16] developed an advanced fuzzy set called the Intuitionistic Fuzzy Set (IFS). In IFS, the non-membership value is computed using fuzzy complement generator functions. Recently, researchers have paid more attention to developing IFS-based clustering methods [17][18][19][20]. Chaira [18] developed an Intuitionistic Fuzzy C-Means (IFCM) in which the intuitionistic fuzzy entropy is added to the conventional FCM objective function.
The intuitionistic fuzzy set handles the uncertainty which originates while defining a membership function by considering the hesitation degree. To handle noise and uncertainty during segmentation, Verma et al. [21] considered both the pixel and local neighborhood information. The main benefit of this method is that it is non-parametric.
Recently, researchers have come to realize that a single clustering method might fail to produce good results with complex data. Hence, they are concentrating on developing consensus clustering methods [22,23]. Consensus clustering is also known as cluster ensemble, and its main aim is to find a single partition of data with overlapping clusters. In the literature, it has been widely agreed that consensus clustering can generate robust results [24][25][26][27]. Motivated by the advantages of consensus clustering, in this paper we propose a brain tissue segmentation method based on consensus clustering. The proposed method consists of two steps: pre-processing and segmentation. In the pre-processing step, the brain images are pre-processed by employing registration, skull stripping, and bias field correction. In the segmentation step, the brain images are first segmented using four different clustering methods. Two of the clustering methods are based on the traditional fuzzy set and the other two are based on the intuitionistic fuzzy set. In the traditional fuzzy set category, Robust Spatial Kernel FCM (RSKFCM) [28] and Generalized Spatial Kernel FCM (GSKFCM) [29] are employed. In the intuitionistic fuzzy set category, two variants of Modified Intuitionistic Fuzzy C-Means [20] are employed. The results of the four individual clustering methods are then combined using a voting schema. The proposed approach is evaluated on two publicly available MRI datasets, OASIS and IBSR18, and the results are compared with those of state-of-the-art methods. The primary contributions of this paper are as follows:
• We propose a consensus clustering method for MRI brain tissue segmentation.
• The results of four variants of fuzzy clustering methods are combined to achieve better results.
• To check the efficacy of the proposed method, we conducted experiments on two standard brain segmentation datasets.
The remainder of the paper is organized as follows: Section 2 presents the methodology of the proposed method. Section 3 introduces the datasets and the evaluation metrics, together with the implementation details and a discussion of the performance of the proposed method. Finally, Section 4 concludes the paper.

Methodology
This section presents the methodology of the proposed method. The proposed consensus clustering method comprises two steps: Pre-processing and Segmentation.

Pre-Processing
We perform three pre-processing steps, namely Registration, Bias Field correction, and Skull Stripping. Registration is the process of spatially aligning two or more images of the same content taken from a different view and/or at a different time, and alignment of the multi-modal image of the same patient is required. Bias Field refers to a low-frequency signal which corrupts the MRI images due to inhomogeneities in the magnetic field of the MRI machines. Bias field leads to intensity inhomogeneity, and in turn, it affects the segmentation accuracy. Hence, the bias field needs to be corrected before performing the segmentation. Skull Stripping is the process of removing non-brain tissues such as fat, skull, and neck. These non-brain tissues have an intensity that overlaps with the intensity of the other brain tissues. Thus, the brain tissues have to be extracted before the brain segmentation. There are many skull stripping methods such as Brain Extraction Tool (BET) [30], Brain Surface Extraction (BSE) [31], AFNI ("Analysis of Functional NeuroImages" (AFNI) software package publicly available at https://afni.nimh.nih.gov/ (accessed on 19 April 2022)), BridgeBurner [32], GCUT [33], and ROBEX [34]. Among all these methods, ROBEX provides significantly improved performance [34].
All the aforementioned steps are optional and depend on the image data used for the study. Hence, in this paper, different pre-processing steps are performed for different datasets. The pre-processed brain images are segmented using consensus clustering. The following subsection presents a detailed description regarding segmentation.

Segmentation
The proposed consensus clustering method combines traditional fuzzy sets and intuitionistic sets, not only to increase the robustness to noise but also to use the neighborhood information when forming the clusters. To do so, we use the Robust Spatial Kernel FCM (RSKFCM) [28] and Generalized Spatial Kernel FCM (GSKFCM) [29] methods alongside the two variants of the Modified Intuitionistic Fuzzy C-Means [20] technique. Finally, we fuse the results of the clustering methods using a voting schema. The next subsections explain the employed clustering methods and the voting schema in detail.

Robust Spatial Kernel FCM (RSKFCM)
Robust Spatial Kernel Fuzzy C-Means (RSKFCM) [28] is a variant of the conventional Fuzzy C-Means (FCM). RSKFCM addresses two limitations of FCM: its sensitivity to noise and its disregard of neighborhood information. RSKFCM injects the neighborhood information into the FCM objective function and uses the Gaussian kernel function instead of the Euclidean metric.
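To illustrate why a Gaussian kernel distance is less sensitive to outlying intensities than the squared Euclidean distance, the following Python sketch compares the two for scalar intensities. The kernel form exp(-(x - v)^2 / sigma^2), the induced distance 2(1 - K), the example center, and the width sigma = 50 are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-(x - v)^2 / sigma^2) for scalars (assumed form)."""
    return math.exp(-((x - v) ** 2) / (sigma ** 2))


def kernel_distance_sq(x, v, sigma):
    """Kernel-induced squared distance 2 * (1 - K(x, v)); it never exceeds 2."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def euclidean_distance_sq(x, v):
    """Squared Euclidean distance, which grows without bound for outliers."""
    return (x - v) ** 2


if __name__ == "__main__":
    center = 100.0  # hypothetical cluster center (gray level)
    sigma = 50.0    # hypothetical kernel width
    for intensity in (100.0, 110.0, 150.0, 255.0):
        print(intensity,
              euclidean_distance_sq(intensity, center),
              round(kernel_distance_sq(intensity, center, sigma), 4))
```

Because the kernel distance saturates, a single very bright or very dark voxel cannot dominate the objective function the way it can under the Euclidean metric.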
The main aim of RSKFCM is to minimize the objective function shown in Equation (1), where c is the number of clusters, n is the number of voxels, m is the fuzzifier, which controls the fuzziness of the resulting partition, w_ij is the RSKFCM membership degree of voxel x_j in the ith cluster, v_i is the ith cluster center, and Φ is an implicit nonlinear map. The distance in the mapped space is computed as in Equation (2), where K is the kernel function, i.e., the inner product K(x, y) = Φ(x)^T Φ(y). In this paper, we adopt the Gaussian kernel function, defined in Equation (3). For the Gaussian kernel, K(x, x) = 1 and K(v, v) = 1; hence, the kernel-induced distance simplifies to the form in Equation (4). Substituting Equation (4) into Equation (1) gives the objective function in Equation (5). The RSKFCM membership function w_ij combines the kernel membership function u_ij and the neighbourhood function s_ij, as given in Equation (6), where p and q are parameters that control the relative importance of the kernel membership and the neighbourhood membership functions.
The kernel and neighbourhood membership functions are computed using Equations (7) and (8), where N_k(x_j) represents the neighbourhood voxels of x_j. The neighbourhood function represents the probability that the voxel x_j belongs to the ith cluster. Similar to FCM, RSKFCM works iteratively, updating the membership and cluster center values. The cluster centers are updated using Equation (9). The iteration stops when the stopping criterion is satisfied, i.e., when the difference between the objective function values of successive iterations is less than the user-specified threshold.
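For reference, Equations (1) to (9) can be written out as below. This is a reconstruction from the definitions in this subsection under the standard kernel FCM and spatial FCM formulations, not a verbatim copy of the authors' notation; in particular, the kernel width σ, the form of the normalizing sums, and the summation index r are assumptions.

```latex
\begin{align}
J &= \sum_{i=1}^{c} \sum_{j=1}^{n} w_{ij}^{m}\, \lVert \Phi(x_j) - \Phi(v_i) \rVert^{2} \tag{1} \\
\lVert \Phi(x_j) - \Phi(v_i) \rVert^{2} &= K(x_j, x_j) + K(v_i, v_i) - 2K(x_j, v_i) \tag{2} \\
K(x, y) &= \exp\!\left( -\frac{\lVert x - y \rVert^{2}}{\sigma^{2}} \right) \tag{3} \\
\lVert \Phi(x_j) - \Phi(v_i) \rVert^{2} &= 2\bigl(1 - K(x_j, v_i)\bigr) \tag{4} \\
J &= 2 \sum_{i=1}^{c} \sum_{j=1}^{n} w_{ij}^{m} \bigl(1 - K(x_j, v_i)\bigr) \tag{5} \\
w_{ij} &= \frac{u_{ij}^{p}\, s_{ij}^{q}}{\sum_{k=1}^{c} u_{kj}^{p}\, s_{kj}^{q}} \tag{6} \\
u_{ij} &= \frac{\bigl(1 - K(x_j, v_i)\bigr)^{-1/(m-1)}}{\sum_{k=1}^{c} \bigl(1 - K(x_j, v_k)\bigr)^{-1/(m-1)}} \tag{7} \\
s_{ij} &= \sum_{r \in N_k(x_j)} u_{ir} \tag{8} \\
v_i &= \frac{\sum_{j=1}^{n} w_{ij}^{m}\, K(x_j, v_i)\, x_j}{\sum_{j=1}^{n} w_{ij}^{m}\, K(x_j, v_i)} \tag{9}
\end{align}
```

Under this form, a fuzzifier of m = 2 reduces the exponent in Equation (7) to −1, so the kernel membership is simply the normalized inverse of 1 − K(x_j, v_i).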

Generalized Spatial Kernel FCM (GSKFCM)
The Generalized Spatial Kernel FCM (GSKFCM) [29] is another variant of the conventional FCM. Even though RSKFCM overcomes the limitations of FCM, its performance is limited because it injects neighborhood information only into the objective function. However, the distance function plays a vital role in computing the membership value; thus, adding neighborhood information to the distance function can increase the performance. RSKFCM also assumes that all features have equal importance, whereas in real-world problems the features may not be equally important. GSKFCM overcomes these limitations by injecting weighted neighbourhood information into the distance function and employing the Gaussian kernel as the distance metric.
The aim of GSKFCM is to minimize the objective function shown in Equation (10), where z_ij is the GSKFCM membership function, computed as in Equation (11). There, d_new is the GSKFCM distance function, which incorporates the neighbourhood function into the distance function and is computed as in Equation (12), where d^2(x_j, v_i) is the Gaussian kernel distance function shown in Equation (4) and p_ij is the neighbourhood function. GSKFCM considers the neighbourhood information and computes the membership value associated with each voxel as the weighted sum of the traditional FCM membership value and the membership values of the N_k neighbouring points. The neighbourhood function p_ij is defined in Equation (13), where N_k is the number of neighbourhood voxels, g(u_ik) = u_ik is the membership function (Equation (7)), and h(x_j, x_k) is the distance function, computed as in Equation (14). Substituting Equation (14) into Equation (13) and Equation (12) into Equation (11) yields the final forms of the neighbourhood function p_ij and the membership function z_ij, respectively. Similar to FCM and RSKFCM, GSKFCM operates iteratively, updating the membership and cluster center values. The cluster centers are updated using Equation (19). GSKFCM decides the label of each voxel based on the maximum membership value.

Modified Intuitionistic Fuzzy C-Means (MIFCM)
Modified Intuitionistic Fuzzy C-Means (MIFCM) [20] is a variant of the conventional Intuitionistic Fuzzy C-Means (IFCM) [18], and it is based on the intuitionistic fuzzy set. In MIFCM, the input data are clustered by optimizing the objective function shown in Equation (20), where x_j represents the jth voxel, v_i refers to the ith cluster center, m refers to the fuzzifier, β_ij refers to the MIFCM membership value of the jth voxel in the ith cluster, and d_H(x_j, v_i) is the modified Hausdorff distance between the jth voxel and the ith cluster center. Similar to Fuzzy C-Means, MIFCM optimizes the objective function iteratively by updating the membership values and cluster centers. The MIFCM membership value is updated using Equation (21), where µ_ij is the membership value and π_ij is the hesitation value. The membership value µ_ij is computed using Equation (22). The hesitation value π_ij is derived from the membership and non-membership values and is computed using Equation (23), where η_ij is the non-membership value. To compute the non-membership value, Sugeno's and Yager's intuitionistic fuzzy complement generators are used, as given in Equations (24) and (25), respectively.
In this paper, we employed both Sugeno's and Yager's complement generators. MIFCM using Sugeno's function is named MIFCM_S and similarly MIFCM using Yager's function is named MIFCM_Y. Furthermore, cluster centers are updated using Equation (26).
MIFCM is an iterative process, and it stops when the convergence criteria are satisfied (i.e., the difference between the objective function value of successive iterations is less than the user-specified stopping criteria value).
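The complement generators and the hesitation degree can be sketched in Python as follows. The closed forms used here are the standard Sugeno and Yager fuzzy complements, N(µ) = (1 − µ)/(1 + λµ) and N(µ) = (1 − µ^α)^(1/α); the parameter values, the function names, and the combination β = µ + π (as in IFCM-style methods) are assumptions for illustration and may differ from the authors' exact Equations (21) to (25).

```python
def sugeno_complement(mu, lam):
    """Sugeno complement: eta = (1 - mu) / (1 + lam * mu), with lam > 0."""
    return (1.0 - mu) / (1.0 + lam * mu)


def yager_complement(mu, alpha):
    """Yager complement: eta = (1 - mu**alpha) ** (1 / alpha), with 0 < alpha <= 1
    so that the hesitation degree stays non-negative."""
    return (1.0 - mu ** alpha) ** (1.0 / alpha)


def hesitation(mu, eta):
    """Hesitation degree pi = 1 - mu - eta of an intuitionistic fuzzy set."""
    return 1.0 - mu - eta


def intuitionistic_membership(mu, eta):
    """Assumed MIFCM-style membership: fuzzy membership plus hesitation."""
    return mu + hesitation(mu, eta)


if __name__ == "__main__":
    mu = 0.5
    eta_s = sugeno_complement(mu, lam=2.0)    # hypothetical lambda
    eta_y = yager_complement(mu, alpha=0.5)   # hypothetical alpha
    print("Sugeno:", eta_s, hesitation(mu, eta_s))
    print("Yager: ", eta_y, hesitation(mu, eta_y))
```

In this sketch the two generators assign different non-membership values to the same µ, which is why MIFCM_S and MIFCM_Y can disagree on borderline voxels and are worth treating as separate voters.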

Consensus Clustering Using Voting Schema
In this section, the segmentation results are combined using a voting schema. Let X = {x_1, x_2, x_3, ..., x_n} be the set of n voxels and Π = {π_1, π_2, ..., π_t} be the set of t clustering algorithms considered for clustering them. Each clustering algorithm π_i assigns every voxel x_j to one of c clusters. The problem of consensus clustering is to find a new partition π* that best summarizes the clustering ensemble. In the proposed work, the voxels of the input brain images are segmented using the four clustering algorithms discussed above. After each algorithm converges, each voxel is assigned to its corresponding cluster based on the membership value. Let U_1, U_2, U_3, and U_4 represent the membership matrices of RSKFCM, GSKFCM, MIFCM_S, and MIFCM_Y, respectively. From these membership matrices, a label for each voxel is computed: the voxel x_j is assigned the label of the cluster for which it has the maximum membership value. Let P_1, P_2, P_3, and P_4 be the label matrices created for RSKFCM, GSKFCM, MIFCM_S, and MIFCM_Y, respectively. From these label matrices, the consensus result is produced using a voting method: the voxel x_j is assigned to the cluster that receives the maximum number of votes among the four labels, i.e.,
$$\mathrm{label}(x_j) = \arg\max_{i} \sum_{k=1}^{4} \mathbb{1}\left[ P_k(x_j) = i \right].$$
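A minimal Python sketch of the voting step is given below. It assumes that cluster labels have already been aligned across the four methods (for example, by ordering the clusters by center intensity so that the same index always denotes the same tissue), and it breaks ties by choosing the smallest label; neither the alignment procedure nor the tie-breaking rule is specified in the text, so both are assumptions. The function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(label_maps):
    """Fuse several label maps (one flat list of labels per method) by voting."""
    n = len(label_maps[0])
    if any(len(labels) != n for labels in label_maps):
        raise ValueError("all label maps must cover the same voxels")
    consensus = []
    for votes in zip(*label_maps):
        counts = Counter(votes)
        best = max(counts.values())
        # Assumed tie-break: the smallest label among the most-voted ones.
        consensus.append(min(label for label, k in counts.items() if k == best))
    return consensus


if __name__ == "__main__":
    # Hypothetical labels for four voxels (0 = CSF, 1 = GM, 2 = WM).
    p1 = [0, 1, 2, 2]  # e.g. RSKFCM
    p2 = [0, 1, 1, 2]  # e.g. GSKFCM
    p3 = [0, 2, 1, 2]  # e.g. MIFCM_S
    p4 = [1, 1, 2, 0]  # e.g. MIFCM_Y
    print(majority_vote([p1, p2, p3, p4]))
```

With an even number of voters, two-versus-two ties are possible (the third voxel in the example), so some explicit tie-breaking rule is needed in practice.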

Experimental Results
This section presents the dataset used for experimentation, the metrics used to evaluate the proposed method, and the experimental setup followed by results and discussion.

Datasets
To assess the proposed method, we carried out experiments on two publicly available standard datasets.

OASIS
The Open Access Series of Imaging Studies (OASIS) is a publicly available standard MRI dataset (see the "Open Access Series of Imaging Studies" (OASIS) project's web site at https://www.oasis-brains.org/ (accessed on 19 April 2022)). This dataset consists of cross-sectional data from 416 subjects aged between 18 and 96. The images in the dataset have a slice thickness of 1.25 mm and a resolution of 256 × 256 × 128.

IBSR18
The Internet Brain Segmentation Repository (IBSR18) (see the "Internet Brain Segmentation Repository" (IBSR) project's web site at https://www.nitrc.org/projects/ibsr/ (accessed on 19 April 2022)) was created by the Center for Morphometric Analysis at the Massachusetts General Hospital. IBSR18 contains 18 T1-weighted MR brain images and their corresponding segmentation ground truth images. The images have a slice thickness of 1.55 mm and a resolution of 256 × 256 × 128. All the images are bias field corrected using the AutoSeg method developed by the University of North Carolina at Chapel Hill (see the "AutoSeg" repository at https://www.nitrc.org/projects/autoseg/ (accessed on 19 April 2022)).

Evaluation Metrics
Usually, the segmentation results are evaluated for CSF, GM, and WM tissues using the following three evaluation metrics: overlap measure, boundary measure, and volume measure. In this paper, we evaluate our proposed method using all three measures.
Dice Similarity Coefficient (DC): The Dice similarity coefficient [35] is used to estimate the spatial overlap between the ground truth and the segmentation results, using the following equation:
$$DC = \frac{2\,|Seg\_Im \cap GT\_Im|}{|Seg\_Im| + |GT\_Im|},$$
where Seg_Im is the segmentation result of the proposed method, and GT_Im is the ground truth. A higher DC represents a more accurate segmentation.
Hausdorff Distance (HD): The Hausdorff distance [36] is used as the boundary measure, and it is calculated between the ground truth points ϕ and the segmented points φ using the following equation:
$$HD(\phi, \varphi) = \max\left\{ \max_{a \in \phi} \min_{b \in \varphi} \lVert a - b \rVert,\; \max_{b \in \varphi} \min_{a \in \phi} \lVert b - a \rVert \right\}.$$
The original Hausdorff distance is affected by outliers [37]. Thus, to reduce the influence of outliers, we used the 95th percentile of the Hausdorff distance. In the following, therefore, HD refers to the 95th percentile of the Hausdorff distance, and a lower HD represents a more accurate result.
Absolute Volume Difference (AVD): Absolute Volume Difference is a volume measure used to compute volume difference between the ground truth and the obtained results. It is computed as follows: Lower AVD indicates a more accurate segmentation.
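The three measures can be sketched in plain Python on sets of voxel coordinates, as below. The sketch makes several assumptions that the text does not fix: the percentile uses the nearest-rank convention, HD is evaluated on whatever point sets are passed in (boundary extraction is left to the caller), and AVD is expressed as a percentage of the ground truth volume. All function names are hypothetical.

```python
import math


def dice(seg, gt):
    """Dice coefficient between two sets of voxel coordinates."""
    if not seg and not gt:
        return 1.0
    return 2.0 * len(seg & gt) / (len(seg) + len(gt))


def _directed_distances(a, b):
    """Distance from every point of a to its nearest point in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def _percentile(values, q):
    """Nearest-rank percentile (assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a, b):
    """95th-percentile Hausdorff distance between two non-empty point sets."""
    return max(_percentile(_directed_distances(a, b), 95),
               _percentile(_directed_distances(b, a), 95))


def avd(seg, gt):
    """Absolute volume difference as a percentage of the ground truth volume
    (assumed normalization)."""
    return abs(len(seg) - len(gt)) / len(gt) * 100.0


if __name__ == "__main__":
    seg = {(0, 0), (0, 1), (1, 0), (1, 1)}  # hypothetical segmentation
    gt = {(0, 1), (1, 1), (0, 2), (1, 2)}   # hypothetical ground truth
    print(dice(seg, gt), hd95(seg, gt), avd(seg, gt))
```

The brute-force nearest-neighbour search is quadratic in the number of points, so this form is only suitable for small examples; a distance transform or spatial index would be the usual choice for full volumes.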

Experimental Setup
In this paper, we set the fuzzifier m to 2 and the stopping criterion ε to 0.0001, and we initialized the cluster centers randomly. We used voxel intensity as the feature. We let the window size N_k vary in {3, 5, 7}. The experiments showed that performance is best when N_k = 3; therefore, we set N_k = 3 in all the experiments. In addition, to set the value of α, we varied α from 0.1 to 1. The experiments showed that performance is best when α = 0.9, and this value was used in all the experiments. To assess the performance of the proposed method, we used 10-fold cross-validation. The proposed model was implemented and tested in MATLAB 2016a.
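To show how these settings enter an iterative fuzzy clustering loop, the sketch below runs conventional FCM (not RSKFCM, GSKFCM, or MIFCM) on scalar intensities with m = 2, ε = 0.0001, and random initial centers. The function name, the seeded initialization from distinct intensity values, the iteration cap, and the toy data are assumptions made for illustration; the original implementation was in MATLAB.

```python
import random


def fcm_1d(intensities, c, m=2.0, eps=1e-4, max_iter=200, seed=0):
    """Conventional FCM on scalar intensities; returns (centers, memberships)."""
    rng = random.Random(seed)
    # Random initialization: pick c distinct intensity values as starting centers.
    centers = rng.sample(sorted(set(intensities)), c)
    prev_objective = float("inf")
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ij is proportional to d_ij^(-2 / (m - 1)).
        memberships = []
        for x in intensities:
            d2 = [(x - v) ** 2 for v in centers]
            if min(d2) == 0.0:
                # Voxel coincides with a center: give it full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in d2]
                row = [h / sum(hits) for h in hits]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                row = [w / sum(inv) for w in inv]
            memberships.append(row)
        # Center update: weighted mean of the intensities with weights u_ij^m.
        centers = [
            sum((row[i] ** m) * x for row, x in zip(memberships, intensities))
            / sum(row[i] ** m for row in memberships)
            for i in range(c)
        ]
        objective = sum(
            (row[i] ** m) * (x - centers[i]) ** 2
            for row, x in zip(memberships, intensities)
            for i in range(c)
        )
        # Stop when the objective changes by less than eps between iterations.
        if abs(prev_objective - objective) < eps:
            break
        prev_objective = objective
    return centers, memberships


if __name__ == "__main__":
    data = [9.0, 10.0, 11.0, 12.0, 99.0, 100.0, 101.0, 102.0]  # toy intensities
    centers, _ = fcm_1d(data, c=2)
    print(sorted(centers))
```

The kernel and intuitionistic variants described earlier keep this loop structure and change only the distance and membership computations.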

Results
This section presents the results on the OASIS and IBSR18 datasets. The performance of the proposed method is compared with state-of-the-art methods. In addition, the performance is also compared with the latest versions of standard brain segmentation tools such as FSL [38], SPM12 [39], and FreeSurfer [40]. The following methods are considered for comparison:
• HMRF-EM [8]: This method combines a hidden Markov random field (HMRF) with an EM algorithm for MRI image segmentation. Its main advantage is that it derives how the spatial information is encoded through the mutual influences of neighboring sites.
• FAST-PVE [41]: This method uses a Markov random field (MRF) for brain tissue segmentation. To increase the speed, it uses fast iterated conditional modes to solve MRFs.
• MSSEG [42]: This method deals with images in the presence of WM lesions. The approach integrates a robust partial volume tissue segmentation with WM outlier rejection and filling, combining intensity and probabilistic and morphological prior maps.
• R-FCM [43]: This method models the intensity inhomogeneities as a gain field that causes image intensities to vary smoothly and slowly through the image space. It iteratively adapts to the intensity inhomogeneities and is completely automated.
• SFCM [44]: This method generalizes the objective function of a conventional FCM by incorporating a spatial penalty on the membership function.
• FANTASM [45]: This method is an extension of an adaptive FCM. In this method, an additional constraint is placed on the membership functions that forces them to be spatially smooth.
• PVC [31]: This method uses a partial volume model for MRI brain tissue segmentation. First, it classifies non-brain tissue using a combination of anisotropic diffusion filtering, edge detection, and mathematical morphology. Further, the local estimates are computed by fitting a partial volume tissue measurement model to histograms of neighborhoods about each estimate point. Voxels in the intensity-normalized image are then classified into six tissue types using a maximum a posteriori classifier.
• SPM5 [46]: This method is based on a mixture of Gaussians. In addition, it is extended to incorporate a smooth intensity variation and nonlinear registration with tissue probability maps.
• GAMIXTURE [47]: This method employs finite mixture models (FMMs) for brain tissue segmentation. To deal with the complex FMM optimization, this method employs a global optimization algorithm that uses blended crossover and a new permutation operator.
• ANN [48]: This method is based on a self-organizing map (SOM). Initially, the feature vector is extracted from the intensity of the pixel and its n nearest neighbors. Further, to improve the robustness, a statistical transformation is applied to the extracted feature vector. Finally, each pixel is classified using SOM.
• KNN [49]: This method uses K-NN for brain tissue segmentation.
• BrainSuite09 [50]: This is an automatic brain image analysis tool. The tool provides a sequence of low-level operations in a single package that can produce accurate brain segmentation in clinical time.
• SVPASEG [51]: This method uses local image models for brain tissue segmentation. This model combines the local models for tissue intensities and a Markov random field (MRF) into a global probabilistic image model. Finally, the parameters for the local intensity models are obtained without supervision by maximizing the weighted likelihood of a certain finite mixture model.
• EGC-SOM [52]: This method uses self-organizing maps (SOM) for brain tissue segmentation. Initially, first- and second-order features are extracted using overlapping windows. Further, evolutionary computing is used for feature selection. Finally, map units are grouped using SOM.
• RF-CRF [53]: This method uses a conditional random field with a random forest for brain tissue segmentation. It uses intensities, gradients, probability maps, and locations as features.

Results on OASIS Dataset
The OASIS dataset contains the images which are already skull stripped. Bias field correction was performed using the ROBEX tool [34]. Figure 1 shows the qualitative segmentation results obtained using the proposed method. We compared the results of the proposed method with the three state-of-the-art methods, i.e., HMRF-EM [8], FAST-PVE [41], and MSSEG [42]. All three methods' codes are available on the authors' websites. The comparison of their results is presented in Table 1. We notice that the proposed model has better performance with regard to CSF, GM, and WM when compared to the other methods.

Results on IBSR18 Dataset
The images in the IBSR18 dataset are already bias field corrected. Hence, we have not applied any bias field correction technique. We conducted the experiments by removing the skull using a ground truth mask. Figure 2 shows the qualitative segmentation results obtained using the proposed method. The main limitation of the IBSR18 dataset is that it considers sulcal CSF as GM. The authors in [54] compared 10 existing methods without considering the sulcal CSF. Following [55,56], in our study we did not remove the sulcal CSF. We have compared the results of the proposed method with state-of-the-art methods. As all the considered methods have used DC alone as an evaluation metric, Table 2 reports only the DC results on the IBSR18 dataset. From this comparison, it is clear that the proposed model has better performance on CSF, GM, and WM when compared to the other methods.

Autism Spectrum Disorder Detection Using Proposed Method
Additionally, the proposed consensus clustering method has been evaluated on a practical autism spectrum disorder (ASD) detection problem. We used publicly available Autism Brain Imaging Data Exchange (ABIDE) data for this study. The ABIDE dataset contains 1112 subjects, 571 of them normal, and 531 of them with Autism Spectrum Disorders. We used 1054 of the 1112 subjects for this study, and the rest were rejected for improper segmentation using voxel-based morphometry (VBM). In this study, we employ a feature extraction method based on VBM [57]. VBM is a fast and automatic method for determining the difference in gray matter structure between normal and ASD patient brains [58]. In our VBM analysis, unified segmentation, smoothing, and statistical analysis were performed as preprocessing steps. In the unified segmentation step, tissue segmentation, bias correction, and image registration were combined in a single generative model [46]. The segmented and registered gray matter images were then smoothed by convolving with an isotropic Gaussian kernel. Here, a 10 mm full-width at half-maximum kernel was employed. A two-sample t-test was performed on the smoothed images, and gray matter volume was used as the covariate. This VBM analysis revealed significant gray matter volume increases in the normal persons in comparison with the ASD patients. The voxel locations of the significant regions were used as a mask. All segmented gray matter images were used to extract gray matter tissue probability values using this mask. A total of 989 features were obtained, and these were used as input to the proposed method. Table 3 presents the performance comparison for Autism Spectrum Disorder detection. The results of the proposed method are compared with traditional K-Means and variants of FCM methods. Table 3 shows that the proposed method outperforms the other methods.
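As a small worked example of the smoothing step, a Gaussian kernel specified by its full width at half maximum (FWHM) can be converted to a standard deviation with σ = FWHM / (2√(2 ln 2)), so a 10 mm FWHM corresponds to σ ≈ 4.25 mm. The Python sketch below performs this conversion; the helper names and the 2 mm voxel size in the usage example are hypothetical, since the text does not state the voxel size of the smoothed images.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def sigma_in_voxels(fwhm_mm, voxel_size_mm):
    """Standard deviation in voxel units for a given (hypothetical) voxel size."""
    return fwhm_to_sigma(fwhm_mm) / voxel_size_mm


if __name__ == "__main__":
    print(round(fwhm_to_sigma(10.0), 4))         # sigma in mm for a 10 mm FWHM
    print(round(sigma_in_voxels(10.0, 2.0), 4))  # sigma in voxels at 2 mm spacing
```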

Discussion
Brain images are very complex, largely uncertain, and imprecise. Fuzzy clustering based methods are capable of handling these challenges. In this paper, we have combined the results from four variants of FCM clustering methods. RSKFCM and GSKFCM are less sensitive to noise due to the use of the kernel distance and the addition of neighborhood information. MIFCM_S and MIFCM_Y are based on the intuitionistic fuzzy set, which considers the non-membership value along with the membership value. Thus, in comparison to RSKFCM and GSKFCM, the MIFCM methods handled the uncertainty better and achieved better results. Since we combined the advantages of all four clustering methods, our proposed consensus clustering method achieved better results compared to state-of-the-art methods.
On the OASIS dataset, the proposed method outperforms the other methods in the comparison. The OASIS dataset contains skull-stripped T1-weighted MRI images. The main challenge in the OASIS dataset is the presence of WM lesions, which affects the overall segmentation accuracy of the proposed method. On the IBSR18 dataset, the proposed method also outperforms all other methods in the comparison. The images in the IBSR18 dataset are affected by acquisition artifacts, which have a direct impact on the WM tissue segmentation. In addition, the lack of sulcal CSF labelling in the ground truth affects the GM and CSF tissue segmentation results. The proposed consensus clustering method has also been evaluated on a practical autism spectrum disorder (ASD) detection problem, where it outperforms the other clustering algorithms. Even though the proposed consensus clustering algorithm is capable of handling noise and can exploit the spatial information in the image, it fails to capture the variations within the neighbourhood voxels. In addition, the time complexity of the proposed algorithm is higher than that of the individual clustering algorithms.

Conclusions
In this paper, a new approach for MRI brain tissue segmentation is presented. The proposed method is based on consensus clustering, in which the results of four variants of fuzzy clustering methods are combined to achieve better results. The results of the proposed method are evaluated using three performance metrics, i.e., DC, HD, and AVD. The competence of the proposed method is validated using two publicly available datasets: OASIS and IBSR18. The experiments show that the proposed method obtains the best results among the compared methods on the OASIS and IBSR18 datasets. Additionally, the proposed consensus clustering method has been evaluated on a practical autism spectrum disorder (ASD) detection problem.