We start with the definition of the logical distance based on the uninorm and absorbing norm, and then present the suggested algorithm. In the last subsection, we consider numerical simulations and compare the suggested algorithm with known methods.
3.1. Logical Distance Based on the Uninorm and Absorbing Norm
Let two truth values be given. For a given neutral element and absorbing element, the fuzzy logical dissimilarity of these values is defined as follows:
where the value in question is the fuzzy logical similarity of the two values.
If , we will write instead of and instead of .
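For concreteness, the following sketch shows one standard pair of such operators: the idempotent uninorm, which acts as the maximum when both arguments exceed the neutral element and as the minimum otherwise, and the median-based absorbing norm (nullnorm). These are textbook examples given only for illustration; they are not necessarily the specific uninorm and absorbing norm adopted in this work, and the Python names and default parameters are ours.

```python
def uninorm(x, y, e=0.5):
    """Idempotent uninorm with neutral element e: behaves as max when both
    arguments are above e and as min otherwise; uninorm(x, e) == x on [0, 1]."""
    if x >= e and y >= e:
        return max(x, y)
    return min(x, y)

def absorbing_norm(x, y, a=0.5):
    """Median-based absorbing norm (nullnorm) with absorbing element a:
    absorbing_norm(x, a) == a for every x in [0, 1]."""
    return sorted((x, y, a))[1]  # median of x, y, and a

if __name__ == "__main__":
    print(uninorm(0.9, 0.5, e=0.5))         # neutral element: returns 0.9
    print(absorbing_norm(0.9, 0.5, a=0.5))  # absorbing element: returns 0.5
```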
Lemma 1. If , then the function is a semi-metric in the algebra .
Proof. To prove the lemma, we need to check the following three properties.
(a) By the commutativity and identity properties of the uninorm, it holds that
Then, by the property of the absorbing element, it holds that
and
if .
(b) If , then either or .
(c) The symmetry follows directly from the commutativity of the absorbing norm. □
Using the dissimilarity function, we define the fuzzy logical distance between the values as follows:
Lemma 2. If , then the function is a semi-metric in the algebra of real numbers on the interval .
Proof. Since, for any and any , both and , by properties (a) and (b) of the semi-metric it holds that
If , then and
If , then . So and
The symmetry follows directly from the symmetry of the dissimilarity. □
An example of the fuzzy logical distance between two truth values is shown in Figure 1a. For comparison, Figure 1b shows the Euclidean distance between the same values.
It is seen that the fuzzy logical distance separates close values better than the Euclidean distance and is less sensitive to distant values.
Now, let us extend the introduced fuzzy logical distance to multidimensional variables.
Let two multidimensional vectors be given, such that each vector is a point in a multidimensional space. The fuzzy logical dissimilarity of these points is defined as follows:
where, as above, the value in question is the fuzzy logical similarity of the two points.
Then, as above, the fuzzy logical distance between the two points is
Lemma 3. If , then the function is a semi-metric in the algebra , .
Proof. This statement is a direct consequence of Lemma 1 and the properties of the uninorm.
If , then for each , if .
If , then for each .
The symmetry follows directly from the symmetry of the dissimilarity for each coordinate and from the commutativity of the uninorm. □
Lemma 4. If , then the function is a semi-metric on the hypercube , .
Proof. The proof is literally the same as the proof of Lemma 2. □
The suggested algorithm uses the introduced function as a distance measure between the instances of the data.
3.2. The c-Means Algorithm with Fuzzy Logical Distance
The suggested algorithm considers the instances of data as truth values and uses Algorithm 1 with fuzzy logical distance on these values.
As above, assume that the raw data are represented by multidimensional vectors, where each vector is a data instance.
Since the distance function requires values from the hypercube, each data vector must be normalized, and the algorithm should be applied to the normalized data vectors. After the cluster centers are defined, the inverse normalization must be applied to the vector of cluster centers.
Normalization can be conducted by several methods; in Appendix A, we present a simple normalization by linear transformation, also called min–max scaling (Algorithm A1). The inverse transformation (Algorithm A2) is also presented in Appendix A. These transformations are the simplest ones for normalizing the raw data; note that, depending on the task, other normalization methods can be applied.
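For illustration, a minimal sketch of min–max scaling and its inverse is given below; it only outlines what Algorithms A1 and A2 are assumed to do, and the NumPy-based interface is our own choice.

```python
import numpy as np

def min_max_scale(X):
    """Min-max scaling of data vectors X (rows are instances) to the unit hypercube.
    Returns the scaled data together with the per-dimension minima and maxima,
    which are needed later for the inverse transformation."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    span = np.where(x_max > x_min, x_max - x_min, 1.0)  # avoid division by zero
    return (X - x_min) / span, x_min, x_max

def inverse_min_max(Z, x_min, x_max):
    """Inverse of min_max_scale: maps points of the unit hypercube
    (e.g., the found cluster centers) back to the original data range."""
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return np.asarray(Z, dtype=float) * span + x_min
```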
In general, the suggested Algorithm 2 follows the Bezdek fuzzy c-means Algorithm 1, but it differs in the distance function and in the initialization of the cluster centers and, consequently, in the definition of the number of clusters.
Algorithm 2. Fuzzy c-means algorithm with fuzzy logical distance measure
Input: data vectors (instances), termination criterion, weighting exponent, precision, and the fuzzy logical distance function with its parameters.
Output: the vector of cluster centers and the matrix of membership degrees.
1. Calculate the normalized data vectors and the corresponding minimal and maximal values using Algorithm A1.
2. Initialize the vector of cluster centers as a grid with step equal to the precision in the hypercube; the number of clusters is defined as the number of nodes in this grid.
3. Initialize the membership degrees.
4. Do
5. Save the membership degrees.
6. Calculate the cluster centers.
7. Calculate the membership degrees using the fuzzy logical distance between each instance and each cluster center.
8. While the change in the membership degrees exceeds the termination criterion.
9. Calculate the renormalized vector of cluster centers from the vector of normalized centers using Algorithm A2.
10. Return the cluster centers and the membership degrees.
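The following sketch outlines the main loop of such a fuzzy c-means procedure with a pluggable distance function, using the standard Bezdek update formulas for the membership degrees and the cluster centers; the grid initialization and the fuzzy logical distance itself are supplied from outside. It is a simplified illustration under our own naming and interface assumptions, not the authors' reference implementation (which was written in MATLAB).

```python
import numpy as np

def fuzzy_c_means(X, centers, dist, m=2.0, tol=1e-5, max_iter=100):
    """Fuzzy c-means with a pluggable distance function.
    X       : (N, n) array of normalized data instances,
    centers : (c, n) array of initial cluster centers (e.g., nodes of a grid),
    dist    : callable dist(x, v) returning the distance between two points,
    m       : weighting exponent, tol : termination criterion."""
    X, centers = np.asarray(X, float), np.asarray(centers, float)

    def memberships(V):
        # u_ik = 1 / sum_j (d(x_k, v_i) / d(x_k, v_j)) ** (2 / (m - 1))
        D = np.array([[dist(x, v) for x in X] for v in V])
        D = np.fmax(D, np.finfo(float).eps)       # avoid division by zero
        inv = D ** (-2.0 / (m - 1.0))
        return inv / inv.sum(axis=0, keepdims=True)

    U = memberships(centers)                       # line 3: initialize membership degrees
    for _ in range(max_iter):
        U_prev = U.copy()                          # line 5: save membership degrees
        W = U ** m                                 # line 6: calculate cluster centers
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        U = memberships(centers)                   # line 7: calculate membership degrees
        if np.max(np.abs(U - U_prev)) < tol:       # line 8: termination criterion
            break
    return centers, U
```

Here, dist can be, for instance, a fuzzy logical distance or the Euclidean distance, and centers can be produced by a grid-initialization routine such as the one sketched after the discussion of Algorithm A3 below.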
The main difference between the suggested Algorithm 2 and the original Bezdek fuzzy c-means Algorithm 1 [13,14], as well as the known Gustafson–Kessel [16] and Gath–Geva [17] algorithms, is the use of the fuzzy logical distance. The use of this distance requires the normalization and renormalization of the data.
Another difference is the need to initialize the cluster centers as a grid in the algorithm's domain. Such initialization is required because of the quick convergence of the algorithm; a regular distribution of the initial cluster centers thus avoids missing clusters. A simple Algorithm A3 for creating such a grid in the square is outlined in Appendix A.
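A possible reading of this initialization is sketched below: the initial cluster centers are placed at the nodes of a regular grid with step equal to the precision in the unit hypercube, so that the initial number of clusters equals the number of grid nodes. This is our own minimal interpretation, not the exact code of Algorithm A3.

```python
import numpy as np
from itertools import product

def grid_centers(n_dims, eps):
    """Initial cluster centers as the nodes of a regular grid with step eps
    in the unit hypercube [0, 1]^n_dims; the number of returned centers
    defines the initial number of clusters."""
    axis = np.arange(0.0, 1.0 + 1e-9, eps)   # grid nodes along one dimension
    return np.array(list(product(axis, repeat=n_dims)))

# Example: eps = 0.25 in two dimensions gives a 5 x 5 grid, i.e., 25 initial centers.
centers = grid_centers(2, 0.25)
print(centers.shape)  # (25, 2)
```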
Let us consider the two main properties of the suggested algorithm.
Theorem 1. Algorithm 2 converges.
Proof. The convergence of Algorithm 2 follows directly from the fact that the fuzzy logical distance is a semi-metric (see Lemma 4).
In fact, in lines 3 and 7 of the algorithm it holds that
and the algorithm converges. □
Theorem 2. The time complexity of Algorithm 2 is , where is the dimensionality of the data, is the number of clusters, is the number of instances in the data, and is the number of iterations.
Proof. First, consider the calculation of the fuzzy logical distances , , . The complexity of this operation is for each dimension ; thus, the calculation of the distances has a complexity .
Now, let us consider the lines of the algorithm. The normalization of the data vectors (line 1) requires steps, and the initialization of the cluster centers (line 2) requires steps (see Algorithm A1 in Appendix A).
Initialization of the membership degrees (line 3) requires steps for each dimension, which gives .
In the do-while loop, saving the membership degrees (line 5) requires steps, the calculation of the cluster centers given the membership degrees (line 6) requires steps, and the calculation of the membership degrees (line 7) requires, as above, steps for each dimension, which gives .
Finally, the renormalization of the vector of the cluster centers (line 9) requires steps.
Thus, the initial (lines 1–3) and final (line 9) operations of the algorithm require steps, and each iteration requires steps. Then, for iterations, steps are required. □
For experimental validation of the running time, the suggested Algorithm 2 was implemented in MATLAB® R2024b and run on an HP PC with an Intel® Core™ i7-1255U 1.70 GHz processor and 16.0 GB of RAM under Microsoft Windows 11.
In the trials, we checked the dependence of the runtime on the number of clusters for datasets of different sizes. The results of the trials are summarized in Table 1.
As was expected (see Theorem 2), the runtime of the algorithm increases with the number of instances and the number of clusters, and it is easy to verify that the increase is linear.
Finally, note that, in the considerations above, we assumed that , which supports the semi-metric properties of the function . In practice, however, these parameters can differ, and despite the absence of formal proofs, the use of the function with while can provide better clustering. In the numerical simulations, we used the values and ; since the data are normalized before clustering, these values do not depend on the range of the raw data.
The other parameter used in the suggested algorithm is the precision , which defines the step of the grid and, consequently, the initial number of clusters. In practice, the value should result in an initial number of clusters that is greater than or equal to the expected number of clusters. If the expected number of clusters is unknown, then should be chosen so that the initial number of clusters is greater than or equal to the number of instances in the dataset. Note that a smaller precision and, consequently, a larger initial number of clusters lead to a longer computation time. The dependence of the results of the algorithm on this value is illustrated below.
3.3. Numerical Simulations
In the first series of simulations, we will demonstrate that the suggested Algorithm 2 with fuzzy logical distance results in more precise centers of the clusters than the original Algorithm 1 with the Euclidean distance.
For this purpose, in the simulations, we generate a single cluster as normally distributed instances with a known center and apply Algorithms 1 and 2 to these data. As a measure of the quality of the algorithms, we use the mean squared error (MSE) in finding the cluster center and the means of the standard deviations of the calculated cluster centers.
To avoid the influence of the instance values, in the simulations, we considered the normalized data. An example of the data (white circles) and the results of the algorithms (gray diamonds for Algorithm 1 and black pentagrams for Algorithm 2) are shown in Figure 2.
The simulations were conducted with different numbers of clusters in a series of trials. The results were tested and compared using one-sample and two-sample Student's t-tests.
In the simulations, we assumed that the algorithm which, for a single cluster, provides less dispersed cluster centers is more precise in the calculation of the cluster centers. In other words, an algorithm that concentrates the cluster centers near the actual cluster center is better than one that produces more dispersed cluster centers.
The results of the simulations are summarized in Table 2. In the table, we present the results of clustering two-dimensional data distributed around the known mean.
It is seen that both algorithms result in cluster centers close to the actual cluster center, and the errors in calculating the centers are extremely small. Additional statistical testing by Student's t-test demonstrated that the differences between the obtained cluster centers are not significant with .
This conclusion was additionally verified using the silhouette criterion [26] with the Euclidean distance measure, which was applied to the cluster centers obtained by Algorithms 1 and 2 with respect to the real cluster centers. The results of the validation with different numbers of instances and different numbers of clusters are summarized in Table 3.
It is seen that both algorithms result in very close means of the silhouette criterion, and the differences between the values of this criterion are not significant. Hence, both algorithms result in cluster centers close to the real cluster centers with the same precision.
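For reference, the silhouette criterion with the Euclidean distance can be computed, for example, with scikit-learn, as in the hypothetical snippet below; this is only a generic illustration of the criterion and not the exact validation set-up used in the trials.

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Hypothetical example: silhouette criterion with the Euclidean distance for a set
# of points and their cluster labels. Here the labels assign each point to its
# nearest cluster center, mimicking a hard version of the clustering.
rng = np.random.default_rng(0)
centers = np.array([[0.2, 0.2], [0.8, 0.8]])
points = np.vstack([c + 0.05 * rng.standard_normal((50, 2)) for c in centers])
labels = np.argmin(
    np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2), axis=1
)
print(silhouette_score(points, labels, metric="euclidean"))
```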
Along with that, from Table 2, it follows that the suggested Algorithm 2 results in a smaller standard deviation than Algorithm 1, and this difference is significant with . Hence, the suggested Algorithm 2 results in more precise cluster centers than Algorithm 1.
These results are supported by the values of the Dunn partition coefficient [27,28], which represents the degree of fuzziness of the clustering. The graph of the Dunn coefficient with respect to the initial number of clusters is shown in Figure 3.
In the figure, the dataset includes ten real clusters. The algorithms started with a single cluster () and continued up to clusters (precision ). For each initial number of clusters, the trial included ten runs, and the plotted value is the average of the coefficients obtained over these ten runs.
As expected, for the suggested Algorithm 2, the Dunn coefficient is much lower than for Algorithm 1, except in the case of a single initial cluster center, for which these values are close. The reason for such behavior of the algorithms is the following: Algorithm 1 distributes the cluster centers over the instances, while the suggested Algorithm 2 concentrates the cluster centers close to the real centers of the clusters.
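For reference, the classical partition coefficient of a fuzzy c-partition is the mean over the instances of the sum of the squared membership degrees; it equals 1 for a crisp partition and decreases toward 1/c as the clustering becomes fuzzier. A minimal sketch, assuming a membership matrix with clusters in rows and instances in columns as in the fuzzy c-means sketch above, is the following:

```python
import numpy as np

def partition_coefficient(U):
    """Partition coefficient of a fuzzy c-partition: the mean over instances of the
    sum of squared membership degrees. U is a (c, N) matrix whose columns sum to one."""
    U = np.asarray(U, dtype=float)
    return float(np.sum(U ** 2) / U.shape[1])

# A crisp partition gives 1.0; the uniform (maximally fuzzy) partition gives 1/c.
print(partition_coefficient(np.array([[1.0, 0.0], [0.0, 1.0]])))  # 1.0
print(partition_coefficient(np.full((4, 10), 0.25)))              # 0.25
```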
To illustrate these results, let us consider simulations of the algorithms on the data with several predefined clusters.
Consider the application of Algorithms 1 and 2 to data with two predefined clusters whose centers are located at the given points. The resulting cluster centers for two different numbers of clusters are shown in Figure 4. The notation in the figure is the same as in Figure 2.
It is seen that the cluster centers obtained by Algorithm 2 (black pentagrams) are concentrated closer to the actual cluster centers than the cluster centers obtained by Algorithm 1 (gray diamonds). Moreover, some of the cluster centers obtained by Algorithm 2 are located at the same points, while all cluster centers obtained by Algorithm 1 are located at different points.
The effect is observed more clearly on data with several clusters. Figure 5 shows the results of Algorithms 1 and 2 applied to data with several predefined clusters.
It is seen that the cluster centers calculated by Algorithm 2 are concentrated at the real centers of the clusters and, as above, several centers are located at the same points. Hence, the suggested Algorithm 2 allows a more correct definition of the cluster centers and, consequently, more correct clustering.
Now, let us apply Algorithms 1 and 2 to the well-known Iris dataset [29,30]. The dataset describes three types of Iris flowers using 150 instances (50 for each type), and each instance includes four attributes: sepal length, sepal width, petal length, and petal width.
In the trials, we applied the algorithms to four pairs of attributes: sepal length and petal length, sepal length and petal width, sepal width and petal length, and sepal width and petal width. The results of Algorithms 1 and 2 are shown in Figure 6.
Similarly to the previous trials, the cluster centers found by the suggested Algorithm 2 are less dispersed than the cluster centers found by Algorithm 1. For large numbers of iterations (in the trials, we used ), the suggested Algorithm 2, in contrast to Algorithm 1, recognizes the cluster centers in hardly separable data, but it does not always result in unambiguous cluster centers. In Figure 6b,c, where one cluster is separate and the other two are hardly separable, Algorithm 2 correctly defined three clusters; in Figure 6a,d, where one cluster is also separate and the other two are non-separable, the algorithm defined four centers in the two non-separable clusters.
Finally, let us illustrate the dependence of the results of the suggested algorithm on the precision and, consequently, on the initial number of clusters. The results of Algorithms 1 and 2 for two values of the precision are shown in Figure 7a and Figure 7b, respectively. Note that, in Figure 7a, the number of cluster centers is four times smaller than the number of instances, and in Figure 7b the number of cluster centers is equal to the number of instances.
It is seen that the cluster centers defined by Algorithm 2 with both precision values are concentrated in the same regions. A further increase in the precision provides the same results but requires a longer computation time.
To illustrate the usefulness of the suggested algorithm in recognizing the number of clusters and analyzing the data structure, let us compare the results of the algorithm with the results obtained by the MATLAB® fcm function [15] with three possible distance measures: Euclidean distance, Mahalanobis distance [16], and exponential distance [17]. These algorithms were applied to data instances distributed around several centers; the obtained results are shown in Figure 8.
It is seen that the suggested algorithm (Figure 8a) correctly recognizes the real centers of the clusters and locates the cluster centers close to these real centers. In contrast, the known algorithms implemented in MATLAB® do not recognize the real centers of the clusters and, consequently, do not define the correct number of clusters or the locations of their centers. The function fcm with the Euclidean and Mahalanobis distance measures (Figure 8b and Figure 8c) results in two cluster centers, and the function fcm with the exponential distance (Figure 8d) results in three cluster centers. Note that the use of the Euclidean and Mahalanobis distance measures leads to very similar results.
Thus, the recognition of the real cluster centers can be conducted in two stages: first, the cluster centers are defined by applying the suggested algorithm to the raw data; second, the final cluster centers are obtained by applying k-means to the cluster centers found at the first stage. The resulting cluster centers indicate the real cluster centers.
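A rough sketch of the second stage, using scikit-learn's k-means as an example and assuming the number of final clusters is known or estimated separately, is the following:

```python
import numpy as np
from sklearn.cluster import KMeans

def two_stage_centers(stage1_centers, n_clusters):
    """Second stage of the procedure: cluster the (possibly many, partly coinciding)
    centers found by the suggested algorithm at the first stage, and return the
    centers of these groups as the estimates of the real cluster centers."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    km.fit(np.asarray(stage1_centers, dtype=float))
    return km.cluster_centers_

# Hypothetical usage: 'centers' are the cluster centers produced by Algorithm 2.
# real_centers = two_stage_centers(centers, n_clusters=3)
```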