Article

A Kernel-Based Intuitionistic Fuzzy C-Means Clustering Using a DNA Genetic Algorithm for Magnetic Resonance Image Segmentation

1 School of Management Science and Engineering, Shandong Normal University, Jinan 250014, China
2 Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA
* Author to whom correspondence should be addressed.
Entropy 2017, 19(11), 578; https://doi.org/10.3390/e19110578
Submission received: 3 July 2017 / Revised: 17 October 2017 / Accepted: 24 October 2017 / Published: 27 October 2017
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

MRI segmentation is critically important for clinical study and diagnosis. Existing methods based on soft clustering have several drawbacks, including low accuracy in the presence of image noise and artifacts, and high computational cost. In this paper, we introduce a new formulation of the MRI segmentation problem as a kernel-based intuitionistic fuzzy C-means (KIFCM) clustering problem and propose a new DNA-based genetic algorithm to obtain the optimal KIFCM clustering. While this algorithm searches the solution space for the optimal model parameters, it also obtains the optimal clustering and, therefore, the optimal MRI segmentation. We perform an empirical study by comparing our method with six state-of-the-art soft clustering methods using a set of UCI (University of California, Irvine) datasets and a set of synthetic and clinical MRI datasets. The preliminary results show that our method outperforms the other methods in both clustering metrics and computational efficiency.

1. Introduction

Segmentation of brain magnetic resonance images (MRIs) into non-overlapping regions of white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) is essential for studying anatomical structure changes and brain quantification [1], and for building tumor growth models. Due to the existence of noise, bias field, and the partial volume effect, brain MRI segmentation faces several challenging issues. Existing methods still suffer from a lack of robustness to outliers [2], a high computational cost [3], the need for manual adjustment of crucial parameters [4], limited segmentation accuracy in the presence of high-level noise [5], and a loss of image details [6].
One promising approach to MRI segmentation is soft clustering, an unsupervised learning technique that groups similar patterns into clusters with soft boundaries. By assigning each pixel to each cluster with a varying degree of membership, this approach is able to account for uncertainty [7]. Several soft clustering methods have been proposed for MRI segmentation, including the well-known fuzzy C-means (FCM) clustering [8], mixture modeling [9], and hybrid methods based on the former two [10]. Although the accuracies of these soft clustering algorithms are good in the absence of image noise, they are sensitive to noise and other imaging artifacts.
Recently, several methods [11,12,13] have been proposed to improve the noise tolerance of FCM clustering by representing clusters as intuitionistic fuzzy sets (IFSs) [14]. Aruna Kumar and Harish presented a modified intuitionistic fuzzy C-means (IFCM) clustering algorithm [15], which adopts a new IFS generator and the Hausdorff distance. Verma et al. presented an improved intuitionistic fuzzy C-means (IIFCM) algorithm [16], which takes local spatial information into consideration. IIFCM is able to tolerate noise and does not require any parameter tuning. However, these algorithms are still based on the Euclidean or Hausdorff distance between pixels. As a result, they can only find linearly separable clusters, and the clustering results depend on the initial choice of centroids.
It is well-known that kernel functions can be used to find clusters that cannot be linearly separated [17,18]. However, the performance of those kernel-based methods is highly sensitive to the choice of kernel parameters [19]. Although several methods [20,21] have been proposed to estimate the optimal values for kernel parameter, the problem has not been completely solved.
We believe that by applying kernel functions to find the IFS-based FCM clustering, we can have a robust method for MRI segmentation. Using this approach, the segmentation problem can be transformed into an optimization problem: finding the optimal kernel parameters that lead to optimal noise-tolerant fuzzy clusters.
The DNA genetic algorithm (DNA-GA), based on DNA computing [22] and the genetic algorithm (GA) [23], has recently been introduced to solve complex optimization problems in many areas, such as parameter estimation for chemical engineering processes [24], function optimization [25], clustering analysis [26,27], and membrane computation [28]. This technique can be used to solve the aforementioned optimization problem.
Motivated by the previous discussion, we formulate the MRI segmentation problem as a kernel-based intuitionistic fuzzy C-means clustering problem and provide a DNA-based genetic algorithm for solving it. Specifically, our contributions in this paper are as follows.
We formulate an image segmentation problem as a kernel-based intuitionistic fuzzy C-means (KIFCM) clustering problem by specifying a new parametric objective function. This formulation includes a new measure for pixel local noise, a method to model fuzzy clusters as intuitionistic fuzzy sets instead of conventional fuzzy sets, and an adaptation of a kernel trick to improve performance.
We propose a new DNA-based genetic algorithm to learn the KIFCM clustering. This algorithm uses a DNA coding scheme to represent individuals (i.e., potential solutions) and a set of improved DNA genetic operators to search the solution space for optimal solutions. Each individual encodes a set of values of the model parameters, including the kernel parameters. While the algorithm searches for the optimal set of model parameters, it also obtains the optimal IFS-based fuzzy clusters.
We perform an empirical study by comparing our method with six existing state-of-the-art fuzzy clustering algorithms using a set of UCI data mining datasets, a set of synthetic MRI data, and a set of clinical MRI datasets. Our preliminary results show that our algorithm outperforms the compared algorithms in both clustering metrics and computational efficiency.
The rest of this paper is organized as follows: Section 2 presents several basic concepts, such as IFSs, FCM, and DNA-GA. Section 3 discusses related work. In Section 4, we develop the objective function that formulates the kernel-based intuitionistic fuzzy C-means problem. In Section 5, we present our algorithm. In Section 6, we present results from experiments performed on UCI data and on synthetic and clinical brain MR images. In Section 7, we discuss the computational performance of our algorithm and several related algorithms. Section 8 concludes the paper.

2. Preliminaries

In this section, we briefly review basic concepts that lay a foundation for the discussions in later sections. Interested readers should refer to the cited references for more details.

2.1. Intuitionistic Fuzzy Sets (IFSs)

Given a set $X$, an intuitionistic fuzzy set [14] over $X$ is defined as:

$$A = \{\, \langle x, u_A(x), v_A(x) \rangle \mid x \in X \,\}$$

where $0 \le u_A(x) \le 1$ is the membership degree and $0 \le v_A(x) \le 1$ the non-membership degree of $x$ with respect to $A$. In general, each element $x$ in an intuitionistic fuzzy set also has a third parameter, $0 \le \pi_A(x) \le 1$, called the hesitation degree, which represents the lack of knowledge about the membership of $x$ (i.e., the indecision about how much membership and non-membership $x$ should have in the fuzzy set). The three degrees satisfy:

$$u_A(x) + v_A(x) + \pi_A(x) = 1$$

Notice that if there is no hesitation, i.e., $\pi_A(x) = 0$ or, equivalently, $v_A(x) = 1 - u_A(x)$ for every $x \in A$, then $A$ is also a fuzzy set [29]. Furthermore, if $v_A(x) = \pi_A(x) = 0$ for every $x \in A$, then $A$ is also a crisp (i.e., ordinary) set.

2.2. Fuzzy C-Means

For a set of data $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^d$, where each datum $x_k$ is a $d$-dimensional vector with real-valued elements, and a given number of clusters $c > 1$, the fuzzy C-means clustering problem is to assign each data vector to the $c$ fuzzy clusters so that the total intra-cluster distance defined by the following objective function is minimized:

$$J_{FCM}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \, \| x_k - v_i \|^2$$
where:
  • $1 \le m < \infty$ is a model parameter that determines the amount of fuzziness in the clustering.
  • $V = \{v_1, v_2, \ldots, v_c\}$ is the set of cluster centroids.
  • $\| x_k - v_i \|^2$ is the squared Euclidean distance between $x_k$ and $v_i$.
  • $u_{ik}$ is the membership degree of datum $x_k$ in the fuzzy cluster with centroid $v_i$.
  • $U = [u_{ik}]_{c \times n}$ is the membership matrix, which satisfies two conditions: (a) $\sum_{i=1}^{c} u_{ik} = 1$ for each column $k$, and (b) $\sum_{k=1}^{n} u_{ik} > 0$ for each row $i$.
The optimal membership degrees and cluster centers can be obtained via an iterative process known as alternate optimization using the following equations:
$$u_{ik} = \frac{\| x_k - v_i \|^{-2/(m-1)}}{\sum_{j=1}^{c} \| x_k - v_j \|^{-2/(m-1)}}, \qquad v_i = \frac{\sum_{k=1}^{n} (u_{ik})^m x_k}{\sum_{k=1}^{n} (u_{ik})^m}$$
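To make the alternate optimization concrete, the following is a minimal NumPy sketch of Equations (3) and (4); the function name, the random initialization, and the convergence test are our illustrative choices rather than part of the original formulation:

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-3, max_iter=100, seed=None):
    """Minimal fuzzy C-means by alternate optimization of Equation (3)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)        # columns of U sum to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)   # centroid update, Eq. (4)
        # Squared distances between every centroid and every point.
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + 1e-12
        U_new = d2 ** (-1.0 / (m - 1))       # ||x_k - v_i||^(-2/(m-1))
        U_new /= U_new.sum(axis=0, keepdims=True)      # membership update, Eq. (4)
        if np.abs(U_new - U).max() < eps:
            return U_new, V
        U = U_new
    return U, V
```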

2.3. DNA Genetic Algorithm

A DNA genetic algorithm [24] solves an optimization problem defined as follows:
$$\min f(v_1, v_2, \ldots, v_n) \quad \text{subject to} \quad l_i \le v_i \le h_i, \; i = 1, 2, \ldots, n$$

As shown in the following algorithm, it encodes each potential solution (referred to as an individual) as a DNA string and searches for the best individual by mimicking DNA genetic operations. It starts by generating at random an initial population of individuals and uses a set of DNA-based genetic operations, such as mutation and crossover, to generate new individuals. It uses the objective function $f(\cdot)$ to compute the fitness values of individuals and evolves the population based on the genetic principle. The process repeats until a pre-determined termination condition is met. The best-fit individual is then decoded and returned as the best solution to the problem.
The generic DNA-GA, named Algorithm 1, is described below.
Algorithm 1: Generic DNA-GA
Input: f(X): the objective function
    n: the number of variables in X
    dom(X): domains of the n variables
    N: the size of the initial population
Method:
  • POP = a population of N randomly generated individuals encoded as DNA strings
  • For each p in POP do
  •   Calculate the fitness value of p using f(decode(p))
  • While termination condition is not satisfied do
  •   NewPOP = selectBestFitIndividuals(POP)
  •   NewPOP = NewPOP ∪ generateNewIndividuals(POP)
  •   POP = NewPOP
  • Return decode(bestFitIndividual(POP))
In Algorithm 1, an individual is a sequence of values for the $n$ decision variables. Each variable is encoded as a DNA sequence of a pre-determined number of nucleotide bases. Each nucleotide base is represented by a base-4 digit denoting one of adenine (A), guanine (G), cytosine (C), and thymine (T). The fitness value of an individual is computed using the decoded variable values. The best-fit individual in an iteration of the algorithm represents the best solution found at that point in time. At each iteration, the next generation of the population is created: the best-fit individual in the current generation is kept, and new individuals are created by applying, at random, a set of DNA genetic operations to individuals randomly selected from the current generation.
By mimicking the DNA reproduction process, the genetic operations perform copying, mutation, and crossover on DNA strands. To avoid premature convergence toward locally optimal solutions, the next generation must provide sufficient variety by including not only the better-fit individuals but also individuals with very different genetic characteristics.
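As a concrete illustration of Algorithm 1, the following Python sketch uses simplified stand-in operators: truncation selection keeps the better-fit half, crossover picks digits uniformly from two parents, and mutation replaces digits at random. The refined operators actually used in this paper are described in Section 5:

```python
import random

def dna_ga(fitness, n_vars, n_digits=6, bounds=(0.0, 1.0),
           n_pop=50, n_gen=100, p_mut=0.05):
    """Generic DNA-GA skeleton (Algorithm 1), minimizing `fitness`."""
    lo, hi = bounds
    length = n_vars * n_digits

    def decode(ind):
        # Map each per-variable base-4 segment to a real value in [lo, hi].
        vals = []
        for v in range(n_vars):
            seg = ind[v * n_digits:(v + 1) * n_digits]
            cv = int("".join(str(d) for d in seg), 4)
            vals.append(cv / (4 ** n_digits - 1) * (hi - lo) + lo)
        return vals

    pop = [[random.randrange(4) for _ in range(length)] for _ in range(n_pop)]
    for _ in range(n_gen):
        pop.sort(key=lambda p: fitness(decode(p)))
        elite = pop[: n_pop // 2]          # better-fit half survives (elitism)
        children = []
        while len(elite) + len(children) < n_pop:
            a, b = random.sample(elite, 2)
            child = [random.choice(pair) for pair in zip(a, b)]   # crossover
            child = [random.randrange(4) if random.random() < p_mut else d
                     for d in child]                              # mutation
            children.append(child)
        pop = elite + children
    return decode(min(pop, key=lambda p: fitness(decode(p))))

# Example: minimize a two-variable sphere function over [-5, 5]^2.
best = dna_ga(lambda v: v[0] ** 2 + v[1] ** 2, n_vars=2, bounds=(-5.0, 5.0))
```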

3. Related Work

The original FCM algorithm [8] is based on the objective function given in Equation (3). Since it does not include any local information, it is sensitive to noise in the image and loses clustering accuracy as noise and image artifacts increase.
The FCM_S algorithm [30] used the following modified objective function:
$$J_{FCM\_S}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \| x_k - v_i \|^2 + \frac{\beta}{N_R} \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \left( \sum_{r \in N_i} \| x_r - v_i \|^2 \right)$$

in which the second term includes spatial information of the neighborhood around each pixel, where $N_i$ is the set of pixels around pixel $x_i$, $N_R$ is the average cardinality of the $N_i$, and $0 < \beta < 1$ is a parameter that controls the effect of the neighbors' spatial information.
Although it helps with handling noise, it is computationally expensive as the neighborhood term must be calculated in each iteration for each pixel.
The FCM_S1 and FCM_S2 algorithms [31] used a further improved objective function:
$$J_{FCM\_S1,S2}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \| x_k - v_i \|^2 + \beta \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \| \bar{x}_k - v_i \|^2$$

in which the second term includes $\| \bar{x}_k - v_i \|^2$ instead of $(1/N_R) \sum_{r \in N_i} \| x_r - v_i \|^2$, where $\bar{x}_k$ is the grayscale of a filtered image and only needs to be calculated once in advance, and $\beta$ is the same as in Equation (6). These algorithms also use kernel functions to calculate the distance. The difference between the two versions is that FCM_S1 uses the average filter and FCM_S2 uses the median filter. Although FCM_S1 and FCM_S2 improved the clustering accuracy, they are sensitive to high levels and different types of noise. In addition, the parameter $\beta$, which is critical to the performance, must be determined manually according to prior knowledge about the noise.
Yang and Tsai [21] proposed a Gaussian kernel-based FCM method. This method also has two forms, GKFCM1 and GKFCM2, for the average and median filters, respectively. It replaces the parameter $\beta$ with a parameter $\eta_j$, which must be calculated in every iteration for every cluster. Using an appropriate value of $\eta_j$, this method can lead to better results than FCM_S1 and FCM_S2. However, to estimate a good value for $\eta_j$, the cluster centroids must be well separated, which is difficult to guarantee. As a result, the algorithm may take many iterations to converge. Moreover, the learning scheme requires many patterns and many cluster centroids to find the optimal value for $\eta_j$. To address this drawback, Elazab et al. [32] proposed an adaptively regularized kernel-based FCM framework (ARKFCM) with a new parameter to control the local information of every pixel.
To overcome the problem of prior parameter adjustment, Krinidis and Chatzis [2] introduced the fuzzy local information C-means (FLICM) algorithm, which includes a fuzzy factor $G_{ij}$ in the objective function to represent the spatial and grayscale information of the neighborhood of each pixel. Although the FLICM algorithm enhances robustness against noise and artifacts, it is computationally expensive because the fuzzy factor must be calculated in each iteration. The KWFLICM algorithm [33] enhances the FLICM algorithm by using a trade-off weighted fuzzy factor $G_{ij}'$ to control the local neighbor relationship and by replacing the Euclidean distance with a kernel function. Using this factor, the KWFLICM algorithm can accurately estimate the damping extent of neighboring pixels, however, at the expense of a substantially increased computational cost. In addition, the algorithm does not preserve small image details.

4. Problem Formulation

In this section, we formulate the FCM problem with a new objective function and show its derivation step by step. Starting from the basic objective function in Equation (7), we modify existing terms or add new terms as new features are included to improve the accuracy and the performance of the algorithm for MRI segmentation. In this section, each datum $x_i$ is a vector in a multi-dimensional space (for example, the coordinates and the greyscale of a pixel). To ease the presentation, we also refer to $x_i$ as pixel $i$.

4.1. Local Intensity Variance

First, we improve the noise resistance by replacing the $\beta$ in Equation (7) with a weight that measures local uniformity. Let $N_k$ be the neighborhood of a given size, e.g., 3 × 3 pixels, around pixel $x_k$. The local intensity variance (LIV) of pixel $x_k$ is defined as:

$$LIV_k = \frac{\sum_{j \in N_k} (x_j - \bar{x}_k)^2}{|N_k|}$$

where $|N_k|$ denotes the number of pixels in $N_k$, and $\bar{x}_k$ is the mean grayscale taken over all pixels in $N_k$. LIV estimates the discrepancy of grayscales in the neighborhood relative to the local average grayscale. A large LIV indicates a higher level of noise within the neighborhood.
We then define a brightness weight for the pixel x k as:
$$\omega_k = \frac{\zeta_k}{\sum_{j \in N_k} \zeta_j}$$
where:
$$\zeta_k = \exp\left( \sum_{j \in N_k,\, j \ne k} LIV_j \right)$$
Intuitively, $\zeta_k$ measures the noise level of the neighborhood $N_k$, and $\omega_k$ is $\zeta_k$ normalized over the neighborhood. Thus, a large $\omega_k$ indicates that pixel $x_k$ is brighter than its neighbors.
Finally, we define the variance for pixel $x_k$ as:

$$\varphi_k = \begin{cases} 2 + \omega_k, & \bar{x}_k < x_k \\ 2 - \omega_k, & \bar{x}_k > x_k \\ 0, & \bar{x}_k = x_k \end{cases}$$

The variance $\varphi_k$ is larger for pixels with a high LIV (or that are brighter than their neighbors) and smaller otherwise. If the local average grayscale equals the grayscale of the central pixel, $\varphi_k$ is zero. The constant 2 in Equation (11) was determined through experiments and provides a balance between the convergence rate and the capability to preserve details.
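As an illustration, the sketch below computes LIV, $\omega_k$, and $\varphi_k$ for a whole image at once using uniform (box) filters. It assumes a grayscale image scaled to [0, 1], the variance form of Equation (8), and scipy.ndimage's default border handling:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance_weights(img, size=3):
    """Per-pixel LIV, brightness weight, and variance (Equations (8)-(11)),
    computed once before clustering begins."""
    area = size * size
    mean = uniform_filter(img, size)                  # local mean x_bar_k
    liv = uniform_filter(img ** 2, size) - mean ** 2  # local variance = LIV_k
    # zeta_k = exp(sum of LIV over the neighborhood, excluding the center).
    zeta = np.exp(uniform_filter(liv, size) * area - liv)
    omega = zeta / (uniform_filter(zeta, size) * area)  # Equation (9)
    phi = np.where(mean < img, 2.0 + omega,
          np.where(mean > img, 2.0 - omega, 0.0))       # Equation (11)
    return liv, omega, phi
```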
We can now rewrite the objective function as follows:
$$J_{FCM}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \| x_k - v_i \|^2 + \sum_{i=1}^{c} \sum_{k=1}^{n} \varphi_k (u_{ik})^m \| \bar{x}_k - v_i \|^2$$
Since $\varphi_k$ depends only on the grayscales within a specified neighborhood, independent of any cluster, it only needs to be calculated once before the clustering process. This is very different from FCM_S, FLICM, and KWFLICM, where the contextual information must be updated for each pixel and each cluster in every iteration. Thus, Equation (12) results in a much lower computational cost.
The contextual information provided by $\varphi_k$ is based on the heterogeneity of the grayscale distribution within the neighborhood. This is different from existing methods that base the contextual information on the difference of the grayscales between neighboring pixels and cluster centroids. As a result, $\varphi_k$ tends to produce a homogeneous clustering, while the local information used in existing methods [33] tends to generate clusterings with more misclassified labels.
Figure 1a shows an image with Rician noise (level 15), in which the white rectangle is an area of 6 × 6 pixels. Figure 1b shows the greyscale values of the pixels in this area, as well as three 3 × 3 neighborhood areas A, B, and C. Figure 1c,d show, respectively, the LIV and $\varphi_k$ for each pixel.

4.2. Intuitionistic Fuzzy C-Means Clustering

We now change the fuzzy clusters from conventional fuzzy sets to intuitionistic fuzzy sets. According to the definition given in Section 2.1, we re-write the objective function in Equation (12) as follows:

$$J_{IFCM}(U^{*}, V) = \left( \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik}^{*})^m \| x_k - v_i \|^2 + \sum_{i=1}^{c} \sum_{k=1}^{n} \varphi_k (u_{ik}^{*})^m \| \bar{x}_k - v_i \|^2 \right) + \sum_{i=1}^{c} \pi_i^{*} e^{1 - \pi_i^{*}}$$

Here $u_{ik}^{*} = u_{ik} + \pi_{ik}$ is the membership of pixel $x_k$ with respect to the intuitionistic fuzzy cluster with centroid $v_i$, $u_{ik}$ is the (conventional) membership degree, and $\pi_{ik}$ is the hesitation degree.
The last term in Equation (13) is called the intuitionistic fuzzy entropy (IFE) and measures the vagueness or ambiguity in the intuitionistic clusters. It is included to minimize the entropy of the histogram of the given data and therefore to maximize the number of good points in each cluster.
We can calculate the hesitation degrees using Yager's negation [20]: $\nu(x) = (1 - x^{\alpha})^{1/\alpha}$, where $\alpha > 0$, $\nu(1) = 0$, and $\nu(0) = 1$. Thus:

$$\pi_{ik} = 1 - u_{ik} - (1 - u_{ik}^{\alpha})^{1/\alpha}$$

and $\pi_i^{*}$ in the last term of Equation (13) is defined by:

$$\pi_i^{*} = \frac{1}{n} \sum_{k=1}^{n} \pi_{ik}$$
Notice that the Yager class parameter α in Equation (14) controls the effect of local information and is crucial for the quality of the fuzzy clustering.
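The hesitation degrees translate into a few array operations. Below is a small sketch of Equations (14) and (15) for a conventional membership matrix U of shape (c, n); the function and variable names are ours:

```python
import numpy as np

def intuitionistic_memberships(U, alpha=2.0):
    """Hesitation degrees via Yager's generator, plus the IFE term of Eq. (13)."""
    Pi = 1.0 - U - (1.0 - U ** alpha) ** (1.0 / alpha)   # Equation (14)
    U_star = U + Pi                                      # u*_ik = u_ik + pi_ik
    pi_bar = Pi.mean(axis=1)                             # pi*_i, Equation (15)
    ife = np.sum(pi_bar * np.exp(1.0 - pi_bar))          # IFE term
    return U_star, pi_bar, ife
```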

4.3. Kernel Intuitionistic FCM (KIFCM)

The Euclidean distance metric used in Equation (13) is valid under the assumption of uncorrelated clusters of the same spherical shape, which may not hold for real data. In addition, it is sensitive to perturbations and outliers. One way to address these problems is to use a different distance metric, such as the Mahalanobis distance. Another way is to use kernel functions to project the data into a higher-dimensional space so that the data can be separated more easily [34]. An advantage of the latter method is that the so-called kernel trick can be used to transform a linear algorithm into a nonlinear one using dot products [35].
To improve the segmentation accuracy and the robustness with respect to outliers, we utilize the kernel trick and replace the Euclidean distance $\| x_k - v_i \|^2$ with $\| \Phi(x_k) - \Phi(v_i) \|^2$. Accordingly, the objective function in Equation (13) can be re-written as:

$$J_{KIFCM}(U^{*}, V) = \left( \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik}^{*})^m \| \Phi(x_k) - \Phi(v_i) \|^2 + \sum_{i=1}^{c} \sum_{k=1}^{n} \varphi_k (u_{ik}^{*})^m \| \Phi(\bar{x}_k) - \Phi(v_i) \|^2 \right) + \sum_{i=1}^{c} \pi_i^{*} e^{1 - \pi_i^{*}}$$

where $\Phi$ is an implicit nonlinear mapping and $\| \Phi(x_k) - \Phi(v_i) \|^2$ is the squared Euclidean distance between the mapped pixels $\Phi(x_k)$ and $\Phi(v_i)$ in a feature space, which can be calculated using the kernel function in the input space as follows:

$$\| \Phi(x_k) - \Phi(v_i) \|^2 = K(x_k, x_k) + K(v_i, v_i) - 2 K(x_k, v_i)$$
where $K(\cdot, \cdot)$ is a kernel function.
A widely-used kernel function is the Gaussian radial basis function (GRBF) [19]:

$$K(x_k, v_i) = \exp\left( -\frac{\| x_k - v_i \|^2}{\sigma^2} \right)$$

where the kernel parameter $\sigma$ determines the spread of the kernel and $\| x_k - v_i \|^2$ is the squared Euclidean distance between the pixels in the original space. Since $K(x, x) = 1$ for the GRBF, Equation (17) can be rewritten as:

$$\| \Phi(x_k) - \Phi(v_i) \|^2 = 2 \left( 1 - K(x_k, v_i) \right)$$
Substituting Equation (19) into Equation (16), we obtain our final objective function:
$$J_{KIFCM}(U^{*}, V) = 2 \left( \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik}^{*})^m \left( 1 - K(x_k, v_i) \right) + \sum_{i=1}^{c} \sum_{k=1}^{n} \varphi_k (u_{ik}^{*})^m \left( 1 - K(\bar{x}_k, v_i) \right) \right) + \sum_{i=1}^{c} \pi_i^{*} e^{1 - \pi_i^{*}}$$

Subject to the conditions required of $U$ and $V$ (see Section 2.2) and given the parameters $m$, $\sigma$, and $\alpha$, Equation (20) is minimized by the membership degrees:

$$u_{ik}^{*} = \frac{\left( (1 - K(x_k, v_i)) + \varphi_k (1 - K(\bar{x}_k, v_i)) \right)^{-1/(m-1)}}{\sum_{j=1}^{c} \left( (1 - K(x_k, v_j)) + \varphi_k (1 - K(\bar{x}_k, v_j)) \right)^{-1/(m-1)}}$$

and the centroids of the intuitionistic fuzzy clusters:

$$v_i = \frac{\sum_{k=1}^{n} (u_{ik}^{*})^m \left( K(x_k, v_i) \, x_k + \varphi_k K(\bar{x}_k, v_i) \, \bar{x}_k \right)}{\sum_{k=1}^{n} (u_{ik}^{*})^m \left( K(x_k, v_i) + \varphi_k K(\bar{x}_k, v_i) \right)}$$

We call this formulation the kernel-based intuitionistic FCM (KIFCM) clustering.
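Both update equations map directly onto array operations. The sketch below assumes scalar (grayscale) pixels and kernel matrices K_xv[i, k] = K(x_k, v_i) and K_xbv[i, k] = K(x̄_k, v_i) evaluated at the current centroids; the names are ours:

```python
import numpy as np

def kifcm_memberships(K_xv, K_xbv, phi, m=2.0):
    """Membership update of Equation (21); K_xv and K_xbv are (c, n)."""
    d = (1.0 - K_xv) + phi[None, :] * (1.0 - K_xbv) + 1e-12
    U = d ** (-1.0 / (m - 1.0))
    return U / U.sum(axis=0, keepdims=True)

def kifcm_centroids(U_star, K_xv, K_xbv, phi, x, x_bar, m=2.0):
    """Centroid update of Equation (22) for grayscale pixels x, x_bar of shape (n,)."""
    Um = U_star ** m
    num = Um * (K_xv * x[None, :] + phi[None, :] * K_xbv * x_bar[None, :])
    den = Um * (K_xv + phi[None, :] * K_xbv)
    return num.sum(axis=1) / den.sum(axis=1)
```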

5. The DNA Genetic Algorithm

In this section, we present a DNA genetic algorithm for the kernel-based intuitionistic fuzzy C-means clustering problem, which we call KIFCM-DNAGA (Algorithm 2). This algorithm is a specialization of the generic DNA-GA presented in Section 2.3.
Depending on the calculation of $\bar{x}_k$ in Step 9, there are variations of this algorithm. Specifically, if $\bar{x}_k$ is the average greyscale, we call the algorithm KIFCM1-DNAGA, and if $\bar{x}_k$ is the median greyscale, the algorithm is denoted KIFCM2-DNAGA. We discuss more details in the following subsections.
KIFCM-DNAGA contains three nested loops. The outermost loop iterates the evolution process and is typically controlled by a user-specified constant, such as the maximum number of iterations. Each of the inner loops is repeated for each of the $n$ pixels. In the innermost loop, each pixel is compared to the centroids of the $c$ fuzzy clusters, and the distance calculations involve the $d$ dimensions. Thus, the complexity of this algorithm is $O(c n^2 d)$. Notice that the kernel matrix is computed only once at the very beginning of the program.
The detailed description of KIFCM-DNAGA is presented in Algorithm 2 below.
Algorithm 2: KIFCM-DNAGA
Input: ϵ: convergence threshold
    t: maximum number of iterations
    c: the number of fuzzy clusters
    K: a kernel function
    M: an MRI image
    N: the size of the population
Output: U: the membership matrix
    V: the centroids of fuzzy clusters
Method:
  1. POP = create N randomly generated individuals encoded as DNA strings
  2. t = 0; initialize U^(0) and V
  3. While t = 0 or ‖U^(t) − U^(t−1)‖ > ϵ do
  4.   t = t + 1
  5.   for each p ∈ POP do
  6.     decode m, α, and σ from p
  7.     for every pixel x_k ∈ M do
  8.       calculate φ_k using Equation (11)
  9.       calculate x̄_k
  10.      calculate u*_ik using Equation (21)
  11.      calculate v_i using Equation (22)
  12.    calculate the fitness value of p using Equation (20)
  13.  V = the V of the best-fit individual in POP
  14.  NewPOP = apply the selection operator to POP
  15.  NewPOP = NewPOP ∪ apply the DNA genetic operators to POP
  16.  POP = apply the reconstruction operator to NewPOP
  17. Return U^(t) and V

5.1. DNA Encoding and Decoding

Recall that if the values of the model parameters $m$, $\sigma$, and $\alpha$ are given, Equations (21) and (22) can be used to calculate the fitness value and the clustering (i.e., the membership degrees and cluster centroids). The optimal clustering is obtained provided that the optimal values for these parameters are given. As pointed out by Chaira [12], the values of the parameters $m$ and $\alpha$ affect the performance of IFCM, and, as noted by Graves and Pedrycz [36], so does the value of the parameter $\sigma$ in a kernel function. Our algorithm is designed to find the optimal values for the parameters $m$, $\sigma$, and $\alpha$, and to obtain the optimal clustering along the way.
In this algorithm, an individual (a potential solution) consists of values for three variables: $v_1$ representing the parameter $\sigma$, $v_2$ representing $\alpha$, and $v_3$ representing $m$. We encode each variable as a string of $n$ base-4 digits, where $n$ is a system parameter and each digit represents a nucleotide base, for example, 0 for A, 1 for G, 2 for C, and 3 for T. Figure 2 shows an example of an encoded individual, where $n = 6$.
For decoding, each variable $v_i$ is mapped to a decimal number as follows:

$$c_{v_i} = \sum_{j=1}^{n} bit(j) \times 4^{\,n-j}$$

where $bit(j)$ is the $j$-th digit from the left of the encoding segment for $v_i$. Depending on the bounds of the variable, the value of the variable is obtained as follows:

$$v_i = \frac{c_{v_i}}{4^n - 1} (h_i - l_i) + l_i$$

where $l_i$ and $h_i$ are the lowest and the highest values of the variable $v_i$, and $(h_i - l_i)/(4^n - 1)$ is the precision of the decoded value for $v_i$.
For example, in Figure 2 the encoding of $\alpha$ (i.e., $v_2$) is the base-4 number 120132, representing GCAGTC. The corresponding decimal number is $c_{v_2} = 1566$. If the range of $\alpha$ is $[1, 15]$, the decoded value of $\alpha$ is approximately 6.35.
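The decoding of Equations (23) and (24) can be checked against this worked example; a minimal sketch (names ours):

```python
def decode_variable(digits, lo, hi):
    """Map base-4 digits (most significant first) to a real value in [lo, hi]."""
    n = len(digits)
    cv = sum(d * 4 ** (n - 1 - j) for j, d in enumerate(digits))  # Eq. (23)
    return cv / (4 ** n - 1) * (hi - lo) + lo, cv                 # Eq. (24)

# GCAGTC -> base-4 digits 1,2,0,1,3,2 with range [1, 15]:
val, cv = decode_variable([1, 2, 0, 1, 3, 2], 1.0, 15.0)
print(cv, round(val, 2))   # 1566 6.35
```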

5.2. The Initialization

In Step 1, $N$ individuals encoded as described in the previous subsection are randomly generated to form the initial population. For each nucleotide base in an individual's DNA strand, we randomly assign one of A, C, G, and T (in fact, the base-4 digit corresponding to the nucleotide base). Unlike existing DNA-GAs, which assign A, C, G, and T with an equal probability of 0.25, our algorithm assigns A, C, G, and T with the differential probabilities 0.156, 0.343, 0.344, and 0.157, respectively. These probabilities are chosen according to the nature of biological structures [37].

5.3. Selection Operator

In Step 14, 50% of the current population is selected to participate in the genetic operations that generate the individuals of the next generation. In general, three types of selection methods, namely roulette wheel, ranking, and tournament, have been used in various DNA-GAs. To be specific, we describe tournament selection here. In this method, pairs of individuals are randomly selected from the current population, and for each pair, the individual with the higher fitness value is selected. In addition, the so-called elitism strategy is applied, which ensures that the best-fit individual is included among the selected individuals.

5.4. DNA Genetic Operators

In Step 15, different types of genetic operations are applied randomly to randomly selected individuals in the new population. These operations are applied in sequence to evolve the new generation. Each of these operations generates offspring individuals from parent individuals and preserves the population size. After an operator is applied to the parent individuals, the offspring individuals survive and participate in the next operation. The operations are discussed in the following subsections.

5.4.1. Crossover Operator

A crossover operator generates two offspring individuals from two randomly selected parent individuals by swapping and mixing nucleotide bases of randomly selected DNA segments of the two parents.
Figure 3 shows an example illustrating the crossover operator. To ease the presentation, only two variables are shown for each individual, and each variable has a sequence of 6 nucleotide bases. Assume that the shaded segments in the two parents are randomly selected. To generate the two offspring individuals, the corresponding nucleotide bases are swapped according to the "choose big" and "choose small" strategies [38]: offspring 1 is created by choosing, at each nucleotide position, the larger of the two parents' values, and offspring 2 the smaller. Consider the shaded segment of $v_2$ in the two parents. The sequences are 211 and 130. The larger digits at each position of these two sequences are, respectively, 2, 3, and 1. Similarly, the smaller digits are 1, 1, and 0. Thus, in this segment, offspring 1 has the sequence 231 and offspring 2 has the sequence 110.
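A sketch of the segment-wise "choose big"/"choose small" rule, reproducing the example above (names ours):

```python
def choose_crossover(parent1, parent2, start, end):
    """Swap a selected segment [start, end) digit-wise: offspring 1 takes the
    larger digit of the two parents, offspring 2 the smaller."""
    off1, off2 = list(parent1), list(parent2)
    for j in range(start, end):
        off1[j] = max(parent1[j], parent2[j])
        off2[j] = min(parent1[j], parent2[j])
    return off1, off2

# The example above: segments 211 and 130 yield 231 and 110.
o1, o2 = choose_crossover([2, 1, 1], [1, 3, 0], 0, 3)   # o1=[2,3,1], o2=[1,1,0]
```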

5.4.2. Mutation Operator

A mutation operator generates an offspring from a parent by randomly changing (mutating) the digit at a randomly selected nucleotide base.
Unlike existing GAs, which select the position to mutate with an equal probability, we adopt a strategy of shifting probabilities. Specifically, we divide the DNA strand of an individual into a high-bit section and a low-bit section, and mutate digits in the two sections with different probabilities: $P_{mh}$ for the high-bit section and $P_{ml}$ for the low-bit section:

$$\begin{cases} P_{mh} = a_1 + \dfrac{b_1}{1 + \exp\left[ c (g - g_0) \right]} \\[2ex] P_{ml} = a_1 + \dfrac{b_1}{1 + \exp\left[ -c (g - g_0) \right]} \end{cases}$$

where $a_1$ is the initial mutation probability, $b_1$ is the range of the mutation probability, $g$ is the evolution generation, $g_0$ is the generation at which the probability shift occurs, and $c$ controls the speed of the change.
The idea is inspired by the DNA "hot spots" and "cold spots" in biology [39]. Nucleotide bases in "cold spots" mutate more slowly than those in "hot spots". To mimic this phenomenon, $P_{mh}$ and $P_{ml}$ are designed to change with the evolution stage. At the beginning of the evolution, $P_{mh}$ is larger than $P_{ml}$. As the evolution progresses, $P_{mh}$ is gradually reduced and $P_{ml}$ increased.
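A sketch of Equation (25); the parameter values here are illustrative, not the ones used in the experiments:

```python
import math

def mutation_probs(g, a1=0.02, b1=0.2, c=0.5, g0=50):
    """Generation-dependent mutation probabilities for the high-bit and
    low-bit sections of the DNA strand."""
    p_mh = a1 + b1 / (1.0 + math.exp(c * (g - g0)))    # decreases with g
    p_ml = a1 + b1 / (1.0 + math.exp(-c * (g - g0)))   # increases with g
    return p_mh, p_ml
```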

5.4.3. Reconstruction Operator

In Step 16, the reconstruction operator is applied to every individual in the new population with probability $P_r$. It generates an offspring by replacing each nucleotide base in a randomly selected segment of each variable with its complementary nucleotide base: A and T replace each other, and so do G and C. This operation is inspired by the DNA double-helix complementarity principle [40].
Figure 4 shows an example of the reconstruction operator. On the left, the nucleotide symbols are shown for the parent and the offspring; on the right, the encoded base-4 numbers are shown for the same parent and offspring. Assume that the right-most 3 digits of each variable are randomly chosen for the operation, based on Watson–Crick [41] pairing. The nucleotide bases ATG (encoded as 031) in $v_1$ and GCT (encoded as 123) in $v_2$ of the parent are changed to TAC (encoded as 302) and CGA (encoded as 210), respectively, in the offspring.
After the complementary operation is applied to the new generation, new random individuals will be generated as needed to keep the population size at N .
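A sketch of the complement rule on the base-4 encoding, reproducing the ATG → TAC example (names ours):

```python
COMPLEMENT = {0: 3, 3: 0, 1: 2, 2: 1}   # A<->T (0<->3), G<->C (1<->2)

def reconstruct(individual, start, end):
    """Replace the digits in [start, end) with their Watson-Crick complements."""
    return [COMPLEMENT[d] if start <= j < end else d
            for j, d in enumerate(individual)]

print(reconstruct([0, 3, 1], 0, 3))   # ATG (031) -> TAC (302)
```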

6. Experiments and Results

In this section, we present results from our experiments.

6.1. Experiment Setup

We compared our algorithm with six state-of-the-art soft clustering algorithms, including GKFCM1 [21], GKFCM2 [21], FLICM [2], KWFLICM [33], MICO [6], and RSCFCM [5].
These algorithms are compared using three types of data: a set of UCI machine learning datasets [42], a set of synthetic MRI data, and a set of clinical MRI data. The UCI datasets include Haberman's Survival Data, Contraceptive Method Choice, Wisconsin Prognostic Breast Cancer, and SPECT Heart Data, with details shown in Table 1.
The synthetic brain MR images include a T1-weighted axial slice with 217 × 181 pixels corrupted with 7% noise and 20% grayscale non-uniformity, a T1-weighted sagittal slice with 181 × 217 pixels corrupted with 7% noise and 20% grayscale non-uniformity, and a T1-weighted axial slice with 217 × 181 pixels corrupted with 10% Rician noise.
The clinical brain MR images include two collections of datasets. The first collection is the dataset used in the MICCAI BRATS 2014 challenge [1]. This dataset contains 220 clinical samples, and each sample has a sequence of 155 T1-weighted axial slice images of 240 × 240 pixels. We randomly selected three samples (whose files are pat266_1, pat192_1, and pat013_1), and from the 155 slice images of each selected sample, we selected the one slice that contains the most details and features and has the best visual quality (specifically, the 80th, 86th, and 90th slice of the three samples, respectively). The second collection consists of MR images acquired with a Philips Medical Systems Intera 3T instrument, obtained from the brain development project [43]. This collection contains 581 samples, and each sample contains 150 to 200 slices. We randomly selected one sample (the 160th) and a coronal slice (the 80th), which is 512 × 300 pixels.
We measure the performance of the clustering algorithms using three metrics: the Jaccard Similarity (JS) [44], the adjusted mutual information (AMI), and the adjusted rand index (ARI) [45].
  • The Jaccard Similarity is defined by:
    $$JS(S_1, S_2) = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|}$$
    where $S_1$ is the segmented volume and $S_2$ is the ground-truth volume.
  • AMI is defined as follows:
    $$AMI = \frac{MI - E(MI)}{\sqrt{H(S_1)\,H(S_2)} - E(MI)}$$
    where $H(S_1)$ and $H(S_2)$ are the entropies of $S_1$ and $S_2$, respectively. The mutual information $MI$ quantifies the amount of information shared between the two random variables and can be defined using the entropy definitions, and $E(MI) = \sum_{\Gamma} MI(\Gamma) P(\Gamma)$ is its expected value.
  • ARI is defined as follows:
    $$ARI = \frac{2 \left( N_{00} N_{11} - N_{01} N_{10} \right)}{(N_{00} + N_{01})(N_{01} + N_{11}) + (N_{00} + N_{10})(N_{10} + N_{11})}$$
    where $N_{11}$ denotes the number of pairs that are in the same cluster in both $U$ and $V$, $N_{00}$ the number of pairs that are in different clusters in both $U$ and $V$, $N_{01}$ the number of pairs that are in the same cluster in $U$ but in different clusters in $V$, and $N_{10}$ the number of pairs that are in different clusters in $U$ but in the same cluster in $V$.
These metrics are commonly used to compare the performance of clustering algorithms. For all three metrics, a higher value indicates a better clustering result, with the highest possible value being 1.
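For reference, the three metrics can be computed from flat label arrays as sketched below. The per-class JS assumes the predicted labels have already been matched to the ground-truth label ids; AMI and ARI use scikit-learn's implementations of [45]:

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

def clustering_scores(labels_pred, labels_true):
    """Jaccard Similarity per class, plus AMI and ARI over all labels."""
    js = {}
    for c in np.unique(labels_true):
        s1, s2 = labels_pred == c, labels_true == c
        js[c] = (s1 & s2).sum() / (s1 | s2).sum()   # |S1 ∩ S2| / |S1 ∪ S2|
    ami = adjusted_mutual_info_score(labels_true, labels_pred)
    ari = adjusted_rand_score(labels_true, labels_pred)
    return js, ami, ari
```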
We implemented the algorithms in MATLAB. The experiments were conducted using a neighborhood of 3 × 3 pixels. The termination conditions include a maximum number of iterations t = 100 and a convergence threshold ε = 0.001.

6.2. Results on UCI Datasets

In this experiment, we investigate how our algorithm performs on generic FCM clustering using the set of UCI machine learning datasets (see Table 1). The same UCI datasets are commonly used in machine learning research [36]. Because these are not image data, we set the brightness variance $\varphi_k$ to 0 in these experiments.
As shown in Table 2, KIFCM1-DNAGA obtains a higher AMI and ARI than the other algorithms, especially on datasets of higher complexity. Moreover, the algorithms using kernel functions show better performance than the traditional algorithms, such as MICO and RSCFCM. The best results are marked in bold.

6.3. Results on Synthetic Brain MR Images

We performed several experiments using the Simulated Brain Database (SBD) [46], which contains a set of realistic MR volumes produced by an MR imaging simulator. These data come with the ground truth, i.e., pixels are labeled for CSF, GM, and WM regions.
The first experiment is to segment a T1-weighted axial slice with 217 × 181 pixels corrupted with 7% noise and 20% grayscale non-uniformity into WM, GM, and CSF. Figure 5 shows the segmentation results and Table 3 summarizes the JS and average running times.
According to Table 3, KIFCM1-DNAGA and KIFCM2-DNAGA outperform the other six algorithms. On average, the improvement in JS is about 1.3–4.8%. The best results are marked in bold.
The second experiment is to segment a T1-weighted sagittal slice with 181 × 217 pixels corrupted with 7% noise and 20% grayscale non-uniformity. This image was chosen to show the capability of preserving details. The segmentation results are shown in Figure 6, and the JS values are presented in Table 4.
Based on Table 4, the average JS of KIFCM1-DNAGA and KIFCM2-DNAGA is better than that of the other algorithms by about 1.4–8.9%. The best results are marked in bold.
The third experiment is to segment a T1-weighted axial slice (number 91) with 217 × 181 pixels corrupted with 10% Rician noise; Rician noise commonly affects MR images [47]. The segmentation results are shown in Figure 7, and the JS values and average running times are given in Table 5. The best results are marked in bold.
Again, the average JS of KIFCM1-DNAGA and KIFCM2-DNAGA is better than that of the other algorithms (by about 1.2–3.6%). The higher JS implies that our algorithms do a better job of preserving image details in the presence of noise and grayscale inhomogeneity.

6.4. Results on Clinical Brain MR Images

In these experiments, we used MR images from two collections of clinical MRI datasets. We first used three T1-weighted axial slices (slice rank 80, 86, and 90, denoted as Brats1, Brats2, and Brats3, respectively) of 240 × 240 pixels included in MICCAI BRATS 2014 challenge. In these images, pixels with grayscale from black to white represent respectively, the background, CSF, GM, and WM. It is worth noting that clustering is carried out only for CSF, GM, and WM, with the pathology region being treated as the background. We then used a coronal slice (denoted as Brats4) with 512 × 300 pixels [43].
Figure 8 shows the results for Brats1 and Figure 9 shows the results for Brats2. The area within the red circles are enlarged to show more details. Figure 10 and Figure 11 show the results for Brats3 and Brats4.
Since the images come only with the ground truth for the pathology and do not have the ground truth for normal tissues, we also used the entropy-based metric [48] to measure the quality of the algorithms. This metric favors segmentations that maximize the uniformity of pixels within each segmented region and minimize the uniformity across regions. The entropy of an image region $j$ is defined as:
$$H(R_j) = - \sum_{m \in V_j} \frac{L_j(m)}{S_j} \log \frac{L_j(m)}{S_j}$$
where $V_j$ is the set of all possible grayscales in region $j$, $L_j(m)$ is the number of pixels in region $j$ with grayscale $m$, and $S_j$ is the area of region $j$. For a grayscale image, the $E$ measure is defined as:
$$E = \sum_{j=1}^{c} \left( \frac{S_j}{S_t} \right) H(R_j) - \sum_{j=1}^{c} \left( \frac{S_j}{S_t} \right) \log \frac{S_j}{S_t}$$
where $S_t$ is the total area of the image, the first term is the expected region entropy of the segmentation, and the second term is the layout entropy. Notice that a smaller $E$ indicates a better segmentation. Table 6 shows the segmentation accuracy in terms of the $E$ measure, as well as the running times of the algorithms. The $E$ values are calculated only for the WM, GM, and CSF regions. The best results are marked in bold.
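A sketch of Equations (27) and (28) for an integer grayscale image and a label map whose regions are numbered 0 to c−1 (names ours):

```python
import numpy as np

def e_measure(img, labels, n_regions):
    """Expected region entropy plus layout entropy; smaller is better."""
    St = labels.size
    expected, layout = 0.0, 0.0
    for j in range(n_regions):
        pix = img[labels == j]
        Sj = pix.size
        if Sj == 0:
            continue
        _, counts = np.unique(pix, return_counts=True)
        p = counts / Sj
        expected += (Sj / St) * (-(p * np.log(p)).sum())   # (Sj/St) * H(Rj)
        layout += -(Sj / St) * np.log(Sj / St)             # layout entropy
    return expected + layout
```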
From Table 6, our algorithms achieve a smaller $E$ than the other algorithms. Visually, GKFCM1 (subfigure b in Figure 8, Figure 9, Figure 10 and Figure 11) and GKFCM2 (subfigure c) achieve good results, but they are unable to preserve fine details. For example, the CSF was incorrectly broken, possibly due to the difficulty of estimating the parameter $\eta_j$. On the other hand, FLICM (subfigure d) and KWFLICM (subfigure e) yield smooth areas but do not preserve fine details either. For example, the CSF in those results is smaller than the surrounding GM and WM, possibly because of the difficulty of estimating optimal values for the factors $G_{ij}$ and $G_{ij}'$. For the MICO algorithm (subfigure f), the edges are not smooth enough, and hence the CSF is not well preserved. For the RSCFCM algorithm (subfigure g), many details are missing. In comparison, KIFCM1-DNAGA (subfigure h) and KIFCM2-DNAGA (subfigure i) show a good balance between smooth borders and image details, preserving image details in the presence of different noise better than the six other algorithms. In particular, even on a larger MR image, the proposed approach still performs well in terms of both the $E$ measure and running time.

7. Discussion

In this section, we discuss and compare the computational complexity and running times of our algorithms and the other algorithms used in our experiments.
Table 7 shows a summary of the computational complexity of the eight algorithms. Assuming all algorithms go through exactly the same number of rounds of evolution, the complexity of these algorithms differs mainly in the number of times each pixel is processed within a single round of evolution.
GKFCM requires $O(c n d^2)$ time to compute the partition matrix in each iteration [36]. If the kernel matrix is computed at the beginning of each iteration, the complexity of computing the partition matrix is $O(c n d)$. Each of FLICM, MICO, and RSCFCM requires $O(c n d)$ per iteration, in which the kernel matrix requires the kernel function to be evaluated $c n$ times. The complexity of the kernel-based algorithms depends on the kernel used; the complexity given in Table 7 is for the Gaussian radial basis function. It is also assumed that, for the KIFCM-DNAGA algorithms, the kernel matrix is computed once rather than at every iteration.
To compare the efficiency of the algorithms, we measured the running time (in seconds) averaged over 10 runs for Brain MR images experiments mentioned in Section 6. For each run, the initial population is made identical for all algorithms. Figure 12 shows the results.
As shown in Figure 12, GKFCM1 and GKFCM2 have the longest running times. This is perhaps due to the parameter $\eta_j$ used by these algorithms, which needs an additional loop over all clusters to update the local contextual information. MICO comes next, with running times spanning 1.3 to 2.85 s. It performs fast calculations owing to the convexity of its energy function, particularly in the presence of less noise, but tends to take many iterations in the presence of high-level noise. RSCFCM took about 1.05 to 2.95 s for the experiments. It uses a spatial fuzzy factor constructed based on the posterior and prior probabilities, taking the spatial direction into account, which is likely to increase the computational cost. FLICM and KWFLICM have similar performance. FLICM uses the parameter $G_{ij}$, which needs an additional loop over the neighborhood of the current pixel to calculate the local information in every iteration, and KWFLICM uses the parameter $G_{ij}'$, which requires two additional loops to process the neighborhood. Finally, KIFCM1-DNAGA and KIFCM2-DNAGA both incorporate local contextual information obtained via LIV, which only needs to be calculated once. As a result, both algorithms require about 0.25–1.55 s to run the experiments, the lowest among the algorithms.

8. Conclusions

In this paper, we introduced a new formulation of the MRI segmentation problem as a kernel-based intuitionistic fuzzy C-means (KIFCM) clustering problem and proposed a new DNA-based genetic algorithm to learn the optimal KIFCM clustering. While this algorithm searches the solution space for the optimal model parameters, it also obtains the optimal clustering and, therefore, the optimal MRI segmentation. We performed an empirical study comparing our method with six state-of-the-art soft clustering methods using a set of UCI data mining datasets and a set of synthetic and clinical MRI data. Our preliminary results show that our method outperforms the other methods in both clustering metrics (against the ground truth) and computational efficiency. For future work, we plan to further improve KIFCM-DNAGA and test it on a larger set of clinical MRIs.

Acknowledgments

The work was done while the first author was visiting the University of Texas at San Antonio, USA. The authors would like to express thanks to the editor and the reviewers for their insightful suggestions. This research is supported in part by Excellent Young Scholars Research Fund of Shandong Normal University, China; the National Science Foundation of China (Nos. 61472231, 61402266); the Jinan Youth Science and Technology Star Project under Grant 20120108; and the soft science research on national economy and social information of Shandong, China under Grant (2015EI013).

Author Contributions

All four authors made significant contributions to this research. Together, they conceived the ideas, designed the algorithm, performed the experiments, analyzed the results, drafted the initial manuscript, and revised the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Elazab, A.; AbdulAzeem, Y.M.; Wu, S.; Hu, Q. Robust kernelized local information fuzzy C-means clustering for brain magnetic resonance image segmentation. J. X-ray Sci. Technol. 2016, 24, 489–507. [Google Scholar] [CrossRef] [PubMed]
  2. Krinidis, S.; Chatzis, V. A Robust Fuzzy Local Information C-Means Clustering Algorithm. IEEE Trans. Image Process. 2010, 19, 1328–1337. [Google Scholar] [CrossRef] [PubMed]
  3. Nikou, C.; Galatsanos, N.P.; Likas, A.C. A class-adaptive spatially variant mixture model for image segmentation. IEEE Trans. Image Process. 2007, 16, 1121–1130. [Google Scholar] [CrossRef] [PubMed]
  4. Cai, W.L.; Chen, S.C.; Zhang, D.Q. Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation. Pattern Recogn. 2007, 40, 825–838. [Google Scholar] [CrossRef]
  5. Li, C.M.; Gore, J.C.; Davatzikos, C. Multiplicative intrinsic component optimization (MICO) for MRI bias field estimation and tissue segmentation. Magn. Reson. Imaging 2014, 32, 913–923. [Google Scholar] [CrossRef] [PubMed]
  6. Ji, Z.X.; Liu, J.Y.; Cao, G.; Sun, Q.S.; Chen, Q. Robust spatially constrained fuzzy c-means algorithm for brain MR image segmentation. Pattern Recogn. 2014, 47, 2454–2466. [Google Scholar] [CrossRef]
  7. Despotovic, I.; Goossens, B.; Philips, W. MRI Segmentation of the Human Brain: Challenges, Methods, and Applications. Comput. Math. Method Med. 2015. [Google Scholar] [CrossRef] [PubMed]
  8. Bezdek, J.C.; Ehrlich, R.; Full, W. FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 1984, 10, 191–203. [Google Scholar] [CrossRef]
  9. Huang, T.; Dong, W.S.; Xie, X.M.; Shi, G.M.; Bai, X. Mixed Noise Removal via Laplacian Scale Mixture Modeling and Nonlocal Low-Rank Approximation. IEEE Trans. Image Process. 2017, 26, 3171–3186. [Google Scholar] [CrossRef] [PubMed]
  10. Prakash, R.M.; Kumari, R.S.S. Spatial Fuzzy C Means and Expectation Maximization Algorithms with Bias Correction for Segmentation of MR Brain Images. J. Med. Syst. 2017, 41, 15. [Google Scholar] [CrossRef] [PubMed]
  11. Iakovidis, D.K.; Pelekis, N.; Kotsifakos, E.; Kopanakis, I. Intuitionistic Fuzzy Clustering with Applications in Computer Vision. Lect. Notes Comput. Sci. 2008, 5259, 764–774. [Google Scholar]
  12. Chaira, T. A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images. Appl. Soft Comput. 2011, 11, 1711–1717. [Google Scholar] [CrossRef]
  13. Bhargava, R.; Tripathy, B.; Tripathy, A.; Dhull, R.; Verma, E.; Swarnalatha, P. Rough intuitionistic fuzzy C-means algorithm and a comparative analysis. In Proceedings of the 6th ACM India Computing Convention, Vellore, India, 22–25 August 2013; p. 23. [Google Scholar]
  14. Atanassov, K.T. Intuitionistic Fuzzy-Sets. Fuzzy Set. Syst. 1986, 20, 87–96. [Google Scholar] [CrossRef]
  15. Aruna Kumar, S.; Harish, B. A Modified Intuitionistic Fuzzy Clustering Algorithm for Medical Image Segmentation. J. Intell. Syst. 2017. [Google Scholar] [CrossRef]
  16. Verma, H.; Agrawal, R.; Sharan, A. An improved intuitionistic fuzzy c-means clustering algorithm incorporating local information for brain image segmentation. Appl. Soft Comput. 2016, 46, 543–557. [Google Scholar] [CrossRef]
  17. Zhang, D.-Q.; Chen, S.-C. Clustering incomplete data using kernel-based fuzzy C-means algorithm. Neural Process. Lett. 2003, 18, 155–162. [Google Scholar] [CrossRef]
  18. Zhang, D.-Q.; Chen, S.-C. A novel kernelized fuzzy c-means algorithm with application in medical image segmentation. Artif. Intell. Med. 2004, 32, 37–50. [Google Scholar] [CrossRef] [PubMed]
  19. Souza, C.R. Kernel functions for machine learning applications. Creat. Commons Attrib. Noncommer. Share Alike 2010, 3, 29. [Google Scholar]
  20. Lin, K.-P. A novel evolutionary kernel intuitionistic fuzzy C-means clustering algorithm. IEEE Trans. Fuzzy Syst. 2014, 22, 1074–1087. [Google Scholar] [CrossRef]
  21. Yang, M.S.; Tsai, H.S. A Gaussian kernel-based fuzzy c-means algorithm with a spatial bias correction. Pattern Recogn. Lett. 2008, 29, 1713–1725. [Google Scholar] [CrossRef]
  22. Adleman, L.M. Molecular Computation of Solutions to Combinatorial Problems. Science 1994, 266, 1021–1024. [Google Scholar] [CrossRef] [PubMed]
  23. Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: Boston, MA, USA, 1989. [Google Scholar]
  24. Tao, J.; Wang, N. DNA computing based RNA genetic algorithm with applications in parameter estimation of chemical engineering processes. Comput. Chem. Eng. 2007, 31, 1602–1618. [Google Scholar] [CrossRef]
  25. Zang, W.; Zhang, W.; Zhang, W.; Liu, X. A Genetic Algorithm Using Triplet Nucleotide Encoding and DNA Reproduction Operations for Unconstrained Optimization Problems. Algorithms 2017, 10, 16. [Google Scholar] [CrossRef]
  26. Zang, W.; Ren, L.; Zhang, W.; Liu, X. Automatic Density Peaks Clustering Using DNA Genetic Algorithm Optimized Data Field and Gaussian Process. Int. J. Pattern Recogn. 2017, 31, 1750023. [Google Scholar] [CrossRef]
  27. Zang, W.; Jiang, Z.; Ren, L. Improved spectral clustering based on density combining DNA genetic algorithm. Int. J. Pattern Recogn. 2017, 31, 1750010. [Google Scholar] [CrossRef]
  28. Zang, W.; Sun, M.; Jiang, Z. A DNA genetic algorithm inspired by biological membrane structure. J. Comput. Theor. Nanosci. 2016, 13, 3763–3772. [Google Scholar] [CrossRef]
  29. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef]
  30. Ahmed, M.N.; Yamany, S.M.; Mohamed, N.; Farag, A.A.; Moriarty, T. A modified fuzzy C-means algorithm for bias field estimation and segmentation of MRI data. IEEE Trans. Med. Imaging 2002, 21, 193–199. [Google Scholar] [CrossRef] [PubMed]
  31. Chen, S.C.; Zhang, D.Q. Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure. IEEE Trans. Syst. Man Cybern. 2004, 34, 1907–1916. [Google Scholar] [CrossRef]
  32. Elazab, A.; Wang, C.; Jia, F.; Wu, J.; Li, G.; Hu, Q. Segmentation of brain tissues from magnetic resonance images using adaptively regularized kernel-based fuzzy-means clustering. Comput. Math. Method Med. 2015, 2015. [Google Scholar] [CrossRef] [PubMed]
  33. Gong, M.G.; Liang, Y.; Shi, J.; Ma, W.P.; Ma, J.J. Fuzzy C-Means Clustering With Local Information and Kernel Metric for Image Segmentation. IEEE Trans. Image Process. 2013, 22, 573–584. [Google Scholar] [CrossRef] [PubMed]
  34. Pillonetto, G.; Dinuzzo, F.; Chen, T.S.; De Nicolao, G.; Ljung, L. Kernel methods in system identification, machine learning and function estimation: A survey. Automatica 2014, 50, 657–682. [Google Scholar] [CrossRef]
  35. Muller, K.R.; Mika, S.; Ratsch, G.; Tsuda, K.; Scholkopf, B. An introduction to kernel-based learning algorithms. IEEE Trans. Neural Networ. 2001, 12, 181–201. [Google Scholar] [CrossRef] [PubMed]
  36. Graves, D.; Pedrycz, W. Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets Syst. 2010, 161, 522–543. [Google Scholar] [CrossRef]
  37. Sun, X. Bioinformatics. Available online: http://www.lmbe.seu.edu.cn/chenyuan/xsun/bioinfomatics/web/CharpterFive/5.4.htm (accessed on 27 October 2017).
  38. Zhang, L.; Wang, N. A modified DNA genetic algorithm for parameter estimation of the 2-Chlorophenol oxidation in supercritical water. Appl. Math. Model. 2013, 37, 1137–1146. [Google Scholar] [CrossRef]
  39. Neuhauser, C.; Krone, S.M. The genealogy of samples in models with selection. Genetics 1997, 145, 519–534. [Google Scholar] [PubMed]
  40. Fischer, M.; Hock, M.; Paschke, M. Low genetic variation reduces cross-compatibility and offspring fitness in populations of a narrow endemic plant with a self-incompatibility system. Conserv. Genet. 2003, 4, 325–336. [Google Scholar] [CrossRef]
  41. Watson, J.D.; Crick, F.H. A structure for deoxyribose nucleic acid. Nature 1953, 171, 737–738. [Google Scholar] [CrossRef] [PubMed]
  42. Asuncion, A.; Newman, D.J. UCI Machine Learning Repository; University of California: Irvine, CA, USA, 2007. [Google Scholar]
  43. Gousias, I.S.; Edwards, A.D.; Rutherford, M.A.; Counsell, S.J.; Hajnal, J.V.; Rueckert, D.; Hammers, A. Magnetic resonance imaging of the newborn brain: Manual segmentation of labelled atlases in term-born and preterm infants. Neuroimage 2012, 62, 1499–1509. [Google Scholar] [CrossRef] [PubMed]
  44. Vovk, U.; Pernus, F.; Likar, B. A review of methods for correction of intensity inhomogeneity in MRI. IEEE Trans. Med. Imaging 2007, 26, 405–421. [Google Scholar] [CrossRef] [PubMed]
  45. Vinh, N.X.; Epps, J.; Bailey, J. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. J. Mach. Learn. Res. 2010, 11, 2837–2854. [Google Scholar]
  46. Cocosco, C.A.; Kollokian, V.; Kwan, R.K.-S.; Pike, G.B.; Evans, A.C. Brainweb: Online Interface to a 3D MRI Simulated Brain Database; CiteSeerX: State College, PA, USA, 1997. [Google Scholar]
  47. He, L.L.; Greenshields, I.R. A Nonlocal Maximum Likelihood Estimation Method for Rician Noise Reduction in MR Images. IEEE Trans. Med. Imaging 2009, 28, 165–172. [Google Scholar] [PubMed]
  48. Zhang, H.; Fritts, J.E.; Goldman, S.A. An entropy-based objective evaluation method for image segmentation. SPIE 2004, 5307, 38–49. [Google Scholar]
Figure 1. An example of LIV and $\varphi_k$ in an image. (a) A Rician noisy image; (b) Grayscales in the 6 × 6 white rectangle area from (a), with 3 neighborhood areas A, B, and C; (c) The LIV associated with each pixel, calculated using Equation (8); (d) The variance $\varphi_k$ associated with each pixel, calculated using Equation (11).
Figure 2. An example of an encoded individual.
Figure 3. An example of the choose crossover operator.
Figure 4. An example of the reconstruction operator.
Figure 5. Segmentation results on a T1-weighted axial slice from SBD with 7% noise and 20% grayscale non-uniformity. (a) Original image; (b) Ground truth; (c) GKFCM1; (d) GKFCM2; (e) FLICM; (f) KWFLICM; (g)MICO; (h) RSCFCM; (i) KIFCM1-DNAGA; (j) KIFCM2-DNAGA.
Figure 6. Segmentation results on a T1-weighted sagittal slice from SBD with 7% noise and 20% grayscale non-uniformity. (a) Original image; (b) Ground truth; (c) GKFCM1; (d) GKFCM2; (e) FLICM; (f) KWFLICM; (g) MICO; (h) RSCFCM; (i) KIFCM1-DNAGA; (j) KIFCM2-DNAGA.
Figure 7. Segmentation results on a T1-weighted axial slice (number 91) from SBD with 10% Rician noise. (a) Original image; (b) Ground truth; (c) GKFCM1; (d) GKFCM2; (e) FLICM; (f) KWFLICM; (g) MICO; (h) RSCFCM; (i) KIFCM1-DNAGA; (j) KIFCM2-DNAGA.
Figure 8. Segmentation results on the Brats1 image. (a) Original image; (b) GKFCM1; (c) GKFCM2; (d) FLICM; (e) KWFLICM; (f) MICO; (g) RSCFCM; (h) KIFCM1-DNAGA; (i) KIFCM2-DNAGA.
Figure 9. Segmentation results on the Brats2 image. (a) Original image; (b) GKFCM1; (c) GKFCM2; (d) FLICM; (e) KWFLICM; (f) MICO; (g) RSCFCM; (h) KIFCM1-DNAGA; (i) KIFCM2-DNAGA.
Figure 10. Segmentation results on the Brats3 image. (a) Original image; (b) GKFCM1; (c) GKFCM2; (d) FLICM; (e) KWFLICM; (f) MICO; (g) RSCFCM; (h) KIFCM1-DNAGA; (i) KIFCM2-DNAGA.
Figure 11. Segmentation results on the Brats4 image. (a) Original image; (b) GKFCM1; (c) GKFCM2; (d) FLICM; (e) KWFLICM; (f) MICO; (g) RSCFCM; (h) KIFCM1-DNAGA; (i) KIFCM2-DNAGA.
Figure 12. Average running times for the experiments shown in subfigure (a) of Figures 5–11.
Table 1. The characteristics of the UCI datasets.
Dataset | Number of Instances | Number of Attributes | Number of Classes
Haberman's Survival Data | 306 | 3 | 2
Contraceptive Method Choice | 1473 | 9 | 3
Wisconsin Prognostic Breast Cancer | 198 | 34 | 2
SPECT Heart Data | 267 | 22 | 2
Table 2. Clustering results for UCI machine learning datasets.
Dataset | Method | Parameters | AMI | ARI
Haberman's Survival Data | GKFCM1 | c = 2, m = 2, σ = 0.5 | 0.7263 | 0.6430
 | GKFCM2 | c = 2, m = 2, σ = 0.5 | 0.7221 | 0.6332
 | FLICM | c = 2, m = 2 | 0.9509 | 0.8620
 | KWFLICM | c = 2, m = 2, σ = 0.5 | 0.9386 | 0.8497
 | MICO | c = 2, m = 2 | 0.7319 | 0.6430
 | RSCFCM | c = 2, m = 2.5 | 0.9406 | 0.8587
 | KIFCM1-DNAGA | c = 2, m = 2.06, α = 11.03, σ = 50.03 | 0.9704 | 0.8815
Contraceptive Method Choice | GKFCM1 | c = 3, m = 2, σ = 1.1 | 0.6534 | 0.6544
 | GKFCM2 | c = 3, m = 2, σ = 1.1 | 0.6525 | 0.6525
 | FLICM | c = 3, m = 2 | 0.7366 | 0.7366
 | KWFLICM | c = 3, m = 1.7, σ = 1.1 | 0.7577 | 0.7577
 | MICO | c = 3, m = 2 | 0.6477 | 0.6477
 | RSCFCM | c = 3, m = 2 | 0.6853 | 0.7515
 | KIFCM1-DNAGA | c = 3, m = 2.45, α = 3.25, σ = 2.12 | 0.7991 | 0.8021
Wisconsin Prognostic Breast Cancer | GKFCM1 | c = 2, m = 2, σ = 150 | 0.9708 | 0.9687
 | GKFCM2 | c = 2, m = 2, σ = 150 | 0.9665 | 0.9644
 | FLICM | c = 2, m = 2 | 0.9451 | 0.9430
 | KWFLICM | c = 2, m = 2, σ = 150 | 0.9771 | 0.9750
 | MICO | c = 2, m = 2 | 0.9808 | 0.9787
 | RSCFCM | c = 2, m = 2 | 0.9832 | 0.9815
 | KIFCM1-DNAGA | c = 2, m = 3.13, α = 1.50, σ = 154.84 | 0.9865 | 0.9844
SPECT Heart Data | GKFCM1 | c = 2, m = 2, σ = 150 | 0.6365 | 0.7039
 | GKFCM2 | c = 2, m = 2, σ = 150 | 0.7264 | 0.7264
 | FLICM | c = 2, m = 2 | 0.8350 | 0.8350
 | KWFLICM | c = 2, m = 2, σ = 150 | 0.8915 | 0.8915
 | MICO | c = 2, m = 2 | 0.7339 | 0.7339
 | RSCFCM | c = 2, m = 2 | 0.8488 | 0.8500
 | KIFCM1-DNAGA | c = 2, m = 3.42, α = 6.04, σ = 156.2 | 0.9174 | 0.9213
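For readers reproducing the AMI and ARI columns in Table 2, both adjusted-for-chance indices [45] are available in standard libraries. The following is a minimal sketch using scikit-learn; the two label vectors are illustrative placeholders, not data from the experiments.

```python
# Minimal sketch: computing AMI and ARI for a clustering result.
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

true_labels = [0, 0, 1, 1, 2, 2]   # ground-truth class of each instance (placeholder)
pred_labels = [0, 0, 1, 2, 2, 2]   # cluster label assigned by a clustering run (placeholder)

ami = adjusted_mutual_info_score(true_labels, pred_labels)  # adjusted mutual information [45]
ari = adjusted_rand_score(true_labels, pred_labels)         # adjusted Rand index
print(f"AMI = {ami:.4f}, ARI = {ari:.4f}")
```

Because both indices are adjusted for chance, a random labeling scores near 0 and a perfect labeling scores 1, which makes the columns in Table 2 directly comparable across datasets.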
Table 3. JS and running time on the T1-weighted axial slice from SBD with 7% noise and 20% grayscale non-uniformity.
Algorithm | GKFCM1 | GKFCM2 | FLICM | KWFLICM | MICO | RSCFCM | KIFCM1-DNAGA | KIFCM2-DNAGA
WM | 0.930 | 0.933 | 0.941 | 0.937 | 0.894 | 0.898 | 0.940 | 0.941
GM | 0.822 | 0.855 | 0.860 | 0.852 | 0.782 | 0.842 | 0.864 | 0.868
CSF | 0.781 | 0.847 | 0.829 | 0.805 | 0.860 | 0.882 | 0.863 | 0.867
Average | 0.844 | 0.879 | 0.876 | 0.865 | 0.845 | 0.874 | 0.889 | 0.892
Time (s) | 0.911 | 0.586 | 3.403 | 139.470 | 0.673 | 2.530 | 0.356 | 0.282
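Tables 3–5 report the Jaccard similarity (JS) between each segmented tissue region and the corresponding ground-truth region, computed per class (WM, GM, CSF) and then averaged. A minimal sketch of this computation on two label maps is given below; the tiny arrays and the tissue label coding are illustrative assumptions, not the paper's data.

```python
import numpy as np

def jaccard_similarity(seg, truth, label):
    """JS for one tissue class: |A ∩ B| / |A ∪ B|, where A and B are the
    pixel sets carrying `label` in the segmentation and the ground truth."""
    a = (seg == label)
    b = (truth == label)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union > 0 else 1.0

# Illustrative 3x3 label maps (1 = WM, 2 = GM, 3 = CSF); placeholders only.
seg   = np.array([[1, 1, 2], [1, 2, 2], [3, 3, 2]])
truth = np.array([[1, 1, 2], [1, 2, 2], [3, 3, 3]])
for label, name in [(1, "WM"), (2, "GM"), (3, "CSF")]:
    print(f"JS({name}) = {jaccard_similarity(seg, truth, label):.3f}")
```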
Table 4. JS and running time on the T1-weighted sagittal slice from SBD with 7% noise and 20% grayscale non-uniformity.
Algorithm | GKFCM1 | GKFCM2 | FLICM | KWFLICM | MICO | RSCFCM | KIFCM1-DNAGA | KIFCM2-DNAGA
WM | 0.773 | 0.775 | 0.771 | 0.765 | 0.672 | 0.702 | 0.785 | 0.788
GM | 0.796 | 0.806 | 0.794 | 0.791 | 0.669 | 0.743 | 0.815 | 0.816
CSF | 0.834 | 0.852 | 0.835 | 0.824 | 0.869 | 0.813 | 0.871 | 0.872
Average | 0.801 | 0.811 | 0.800 | 0.794 | 0.736 | 0.753 | 0.824 | 0.825
Time (s) | 1.377 | 1.797 | 5.392 | 166.717 | 0.751 | 0.786 | 0.348 | 0.329
Table 5. JS and running time on the T1-weighted axial slice (number 91) from SBD corrupted with 10% Rician noise.
Algorithm | GKFCM1 | GKFCM2 | FLICM | KWFLICM | MICO | RSCFCM | KIFCM1-DNAGA | KIFCM2-DNAGA
WM | 0.931 | 0.921 | 0.929 | 0.927 | 0.885 | 0.891 | 0.926 | 0.925
GM | 0.804 | 0.827 | 0.831 | 0.821 | 0.777 | 0.803 | 0.838 | 0.840
CSF | 0.826 | 0.872 | 0.861 | 0.846 | 0.889 | 0.876 | 0.887 | 0.892
Average | 0.854 | 0.873 | 0.874 | 0.865 | 0.850 | 0.857 | 0.884 | 0.886
Time (s) | 1.836 | 1.869 | 2.953 | 124.922 | 0.649 | 0.781 | 0.218 | 0.220
Table 6. Segmentation performance in the E measure and running times on the Brats1, Brats2, Brats3, and Brats4 images.
Image | Measure | GKFCM1 | GKFCM2 | FLICM | KWFLICM | MICO | RSCFCM | KIFCM1-DNAGA | KIFCM2-DNAGA
Brats1 | E | 1.311 | 1.288 | 1.392 | 1.424 | 1.309 | 1.314 | 1.274 | 1.270
 | Time (s) | 2.591 | 1.516 | 9.510 | 6.354 | 1.747 | 12.622 | 1.185 | 1.111
Brats2 | E | 1.307 | 1.297 | 1.336 | 1.340 | 1.298 | 1.307 | 1.273 | 1.271
 | Time (s) | 1.576 | 1.115 | 5.170 | 4.002 | 1.546 | 1.674 | 1.091 | 0.827
Brats3 | E | 1.295 | 1.285 | 1.324 | 1.328 | 1.286 | 1.295 | 1.261 | 1.259
 | Time (s) | 1.564 | 1.103 | 5.158 | 4.002 | 1.534 | 1.662 | 1.079 | 0.815
Brats4 | E | 1.425 | 1.415 | 1.454 | 1.458 | 1.416 | 1.424 | 1.391 | 1.389
 | Time (s) | 6.099 | 4.301 | 20.116 | 15.611 | 5.982 | 6.481 | 4.2081 | 3.1785
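The E measure in Table 6 is the entropy-based objective evaluation of Zhang et al. [48], which requires no ground truth: lower values indicate more uniform regions arranged in a simpler layout. The sketch below follows one common reading of that measure, E = H_l + H_r (layout entropy plus expected within-region grayscale entropy); the exact weighting in [48] may differ, so treat this as an assumption-laden illustration rather than the paper's implementation.

```python
import numpy as np

def entropy_based_E(image, seg):
    """Sketch of an entropy-based evaluation in the spirit of [48], read as
    E = H_l + H_r. `image` is a 2-D grayscale array and `seg` a same-shaped
    array of region labels produced by a segmentation algorithm."""
    n = image.size
    H_r = 0.0  # expected grayscale entropy within each region
    H_l = 0.0  # layout entropy over region sizes
    for label in np.unique(seg):
        region = image[seg == label]
        w = region.size / n                        # fraction of pixels in this region
        _, counts = np.unique(region, return_counts=True)
        p = counts / counts.sum()                  # grayscale distribution in the region
        H_r += w * -(p * np.log2(p)).sum()
        H_l += -w * np.log2(w)
    return H_l + H_r
```

Under this reading, the smaller E values attained by KIFCM1-DNAGA and KIFCM2-DNAGA in Table 6 correspond to segmentations with more homogeneous regions.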
Table 7. Computational complexity.
Algorithm | GKFCM1 | GKFCM2 | FLICM | KWFLICM | MICO | RSCFCM | KIFCM1-DNAGA | KIFCM2-DNAGA
Overall complexity | O(cnd²) | O(cnd²) | O(cnd) | O(cnd) | O(cn²d) | O(cnd) | O(cn²d) | O(cn²d)
