Article

R-Norm Entropy and R-Norm Divergence in Fuzzy Probability Spaces

by Dagmar Markechová 1,*, Batool Mosapour 2 and Abolfazl Ebrahimzadeh 3
1 Department of Mathematics, Faculty of Natural Sciences, Constantine The Philosopher University in Nitra, A. Hlinku 1, Nitra SK-949 01, Slovakia
2 Department of Mathematics, Farhangian University, Kerman +98-7616914111, Iran
3 Young Researchers and Elite Club, Zahedan Branch, Islamic Azad University, Zahedan +98-9816883673, Iran
* Author to whom correspondence should be addressed.
Entropy 2018, 20(4), 272; https://doi.org/10.3390/e20040272
Submission received: 12 March 2018 / Revised: 4 April 2018 / Accepted: 9 April 2018 / Published: 11 April 2018
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract:
In the presented article, we define the R-norm entropy and the conditional R-norm entropy of partitions of a given fuzzy probability space and study the properties of the suggested entropy measures. In addition, we introduce the concept of R-norm divergence of fuzzy P-measures and we derive fundamental properties of this quantity. Specifically, it is shown that the Shannon entropy and the conditional Shannon entropy of fuzzy partitions can be derived from the R-norm entropy and conditional R-norm entropy of fuzzy partitions, respectively, as the limiting cases for R going to 1; the Kullback–Leibler divergence of fuzzy P-measures may be inferred from the R-norm divergence of fuzzy P-measures as the limiting case for R going to 1. We also provide numerical examples that illustrate the results.

1. Introduction

The concept of information entropy was introduced by Claude Shannon in 1948 in his article [1]. It is used in information theory [2] to quantify the amount of information or uncertainty inherent in a system. We recall that Shannon's entropy is defined in the context of a probabilistic model. Consider a measurable partition $A = \{E_1, E_2, \dots, E_n\}$ of a probability space $(X, S, P)$ (that is, a finite collection of measurable subsets $E_k \subset X$, $k = 1, 2, \dots, n$, such that $\bigcup_{k=1}^n E_k = X$ and $E_i \cap E_j = \emptyset$ whenever $i \neq j$) with probabilities $p_k = P(E_k)$, $k = 1, 2, \dots, n$. Then the Shannon entropy of $A$ is defined as the number $h(A) = -\sum_{k=1}^n p_k \log_b p_k$, with the convention that $0 \log_b 0 = 0$ (which is justified by the fact that $\lim_{p \to 0} p \log_b \frac{1}{p} = 0$). The base of the logarithm can be any positive real number; depending on the selected base, the entropy is expressed in bits ($b = 2$), nats ($b = e$), or dits ($b = 10$).
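As a quick numerical illustration of Shannon's formula, the following sketch (plain Python; the probability values are an arbitrary example, not taken from the paper) computes $h(A)$ for a chosen base:

```python
import math

def shannon_entropy(probs, base=2):
    """h(A) = -sum_k p_k * log_b(p_k), with the convention 0 * log 0 = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A partition with probabilities (1/2, 1/4, 1/4) has entropy 1.5 bits.
print(shannon_entropy([0.5, 0.25, 0.25]))          # bits (b = 2)
print(shannon_entropy([0.5, 0.25, 0.25], math.e))  # nats (b = e)
```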
The extensions of Shannon's entropy have led to several alternative entropy measures, of which the Rényi entropy [3] is one of the most important. The classical logical entropy (cf. [4,5]) and the entropy measure called the R-norm entropy (cf. [6,7]) are other alternatives. In this article, we will deal with the study of the R-norm entropy. If $P = \{p_1, p_2, \dots, p_n\}$ is a probability distribution, then the R-norm entropy is defined, for every real number $R \in (0, 1) \cup (1, \infty)$, by the formula:
$$H_R(P) = \frac{R}{R-1}\left(1 - \Big[\sum_{k=1}^n p_k^R\Big]^{1/R}\right).$$
Some results regarding the R-norm entropy measure and its generalizations can be found in [8,9,10,11,12]. The above entropy measures have found many important applications, for example, in statistics, pattern recognition, and coding theory.
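A small sketch may help make the formula above concrete. The Python below (the distribution is an arbitrary example) evaluates $H_R(P)$ and checks numerically that it approaches the Shannon entropy in nats as $R$ goes to 1:

```python
import math

def r_norm_entropy(probs, R):
    """H_R(P) = R/(R-1) * (1 - (sum_k p_k^R)^(1/R)), for R > 0, R != 1."""
    norm = sum(p ** R for p in probs) ** (1.0 / R)
    return R / (R - 1.0) * (1.0 - norm)

def shannon_nats(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

p = [0.5, 0.3, 0.2]
print(r_norm_entropy(p, 2.0))
# As R -> 1, the R-norm entropy tends to the Shannon entropy in nats.
print(r_norm_entropy(p, 1.000001), shannon_nats(p))
```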
In classical probability theory, partitions are defined in the context of the Cantor set theory. In solving many real-life problems, however, partitions defined in terms of fuzzy set theory [13] are more appropriate. Therefore, many proposals have been made to generalize classical partitions to fuzzy partitions [14,15,16,17,18,19,20]. Fuzzy partitions represent a mathematical tool for modeling random experiments that lead to unclear, vague events. Naturally, there are also many results concerning the Shannon entropy of fuzzy partitions; see e.g., [21,22,23,24,25,26,27,28,29,30,31]. We note that in [32] the results regarding the entropy of fuzzy partitions provided in [25] have been employed to introduce the notions of mutual information and Kullback–Leibler divergence for the fuzzy case. The notion of Kullback–Leibler divergence was introduced in [33] as a distance measure between two probability distributions. It plays a significant role in information theory and in various disciplines such as statistics, machine learning, physics, neuroscience, computer science, and linguistics.
Since its inception in 1965, the theory of fuzzy sets has advanced in many mathematical disciplines and has found important applications in practice. Currently, the subjects of intense study are algebraic systems based on the theory of fuzzy sets, for example, D-posets [34,35,36], MV-algebras [37,38,39,40,41], and effect algebras [42]. Some results regarding the above entropy measures and divergence on these structures can be found e.g., in [43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58].
The aim of this article is to study the R-norm entropy of fuzzy partitions and the R-norm divergence in fuzzy probability spaces [59]. The organization of the paper is as follows. In Section 2 we provide basic definitions, terminology and some known results used in the paper. The results of the article are presented in Section 3 and Section 4. In Section 3, we define the R-norm entropy and conditional R-norm entropy of fuzzy partitions and examine their properties. In Section 4, the concept of the R-norm divergence for the case of fuzzy probability spaces is proposed and the properties of this distance measure are studied. The results presented in Section 3 and Section 4 are illustrated with numerical examples. The paper concludes in Section 5 with a brief summary.

2. Preliminaries

We begin by recalling the basic concepts and the known results used in the paper.
It is known that the concept of a fuzzy set, introduced by Zadeh in 1965 [13], extends classical set theory. In classical set theory, the membership of elements in a set is assessed in binary terms: an element either belongs or does not belong to the set under consideration. By contrast, a fuzzy set is characterized by a membership function which assigns to every element a grade of membership ranging between zero and one. The mathematical model of a fuzzy set is as follows. Let $X$ be a non-empty set. By a fuzzy subset of $X$ we mean a mapping $a : X \to [0, 1]$ (where the considered fuzzy set is identified with its membership function). The value $a(x)$ is interpreted as the grade of membership of the element $x \in X$ in the fuzzy set $a$.
Definition 1.
Let $X$ be a non-empty set, and let $M \subset [0, 1]^X$ be a family of fuzzy subsets of $X$. The pair $(X, M)$ is called a fuzzy measurable space if the following conditions are satisfied: (i) $1_X \in M$, $(1/2)_X \notin M$; (ii) if $a \in M$, then $a' = 1_X - a \in M$; (iii) if $a_n \in M$, $n = 1, 2, \dots$, then $\bigvee_{n=1}^{\infty} a_n \in M$. A family $M \subset [0, 1]^X$ with the properties (i)–(iii) is said to be a fuzzy $\sigma$-algebra.
Throughout the paper, the symbols $\bigvee_{n=1}^{\infty} a_n$ and $\bigwedge_{n=1}^{\infty} a_n$ denote the fuzzy union and the fuzzy intersection of a sequence $\{a_n\}_{n=1}^{\infty} \subset M$, respectively, in the sense of Zadeh [13], i.e., $\bigvee_{n=1}^{\infty} a_n = \sup_n a_n$ and $\bigwedge_{n=1}^{\infty} a_n = \inf_n a_n$. The symbol $a'$ denotes the complement of a fuzzy set $a \in M$, i.e., $a' = 1_X - a$. Here, $1_X$ denotes the constant function with the value 1; analogously, the symbols $(1/2)_X$ and $0_X$ denote the constant functions with the values $1/2$ and $0$, respectively. Additionally, the relation $\leq$ denotes the usual order of fuzzy subsets of $X$, i.e., $a \leq b$ if and only if $a(x) \leq b(x)$, for every $x \in X$. The complementation $' : a \mapsto a'$, $a \in M$, satisfies, for every $a, b \in M$, the conditions: (i) $(a')' = a$, and (ii) $a \leq b$ implies $b' \leq a'$.
Fuzzy subsets $a, b \in M$ with the property $a \wedge b = 0_X$ are said to be separated; fuzzy subsets $a, b \in M$ with the property $a \leq b'$ are said to be W-separated fuzzy sets. Any fuzzy subset $a \in M$ with the property $a \geq a'$ is said to be a W-universum; any fuzzy subset $a \in M$ with the property $a \leq a'$ is said to be a W-empty fuzzy set. A fuzzy set from the fuzzy $\sigma$-algebra $M$ is interpreted as a fuzzy event. W-separated fuzzy events are considered to be mutually exclusive events. A W-universum is interpreted as a certain event, a W-empty set as an impossible event. It can be shown that a fuzzy subset $a \in M$ is a W-universum if and only if there exists a fuzzy subset $b \in M$ such that $a = b \vee b'$.
Naturally, the notion of a fuzzy measurable space generalizes the concept of a measurable space $(X, S)$ from classical measure theory; it suffices to put $M = \{\chi_E : E \in S\}$, where $\chi_E$ is the characteristic function of the set $E \in S$. With this procedure, the classical model can be embedded into the fuzzy one.
Definition 2
([59]). Let $(X, M)$ be a fuzzy measurable space. A map $\mu : M \to [0, 1]$ is said to be a fuzzy P-measure if the following conditions are satisfied: (i) $\mu(a \vee a') = 1$, for every $a \in M$; (ii) if $\{a_n\}_{n=1}^{\infty}$ is a sequence of pairwise W-separated fuzzy sets from $M$, then $\mu(\bigvee_{n=1}^{\infty} a_n) = \sum_{n=1}^{\infty} \mu(a_n)$. The triplet $(X, M, \mu)$ is called a fuzzy probability space.
The fuzzy P-measure μ : M [ 0 , 1 ] has the properties that correspond to properties of a classical probability measure; the proofs can be found in [59].
(P1) $\mu(a') = 1 - \mu(a)$, for every $a \in M$.
(P2) $\mu$ is non-decreasing, i.e., if $a, b \in M$ with $a \leq b$, then $\mu(a) \leq \mu(b)$.
(P3) $\mu(a \vee b) + \mu(a \wedge b) = \mu(a) + \mu(b)$, for every $a, b \in M$.
(P4) Let $b \in M$. Then $\mu(a \wedge b) = \mu(a)$ for all $a \in M$ if and only if $\mu(b) = 1$.
(P5) If $a, b \in M$ such that $a \leq b'$, then $\mu(a \wedge b) = 0$.
(P6) If $a, b \in M$ such that $a \leq b$, then $\mu(b) = \mu(a) + \mu(a' \wedge b)$.
Definition 3
([14]). A fuzzy partition of a fuzzy probability space $(X, M, \mu)$ is a collection $A = \{a_1, a_2, \dots, a_n\}$ of pairwise W-separated fuzzy sets from $M$ with the property $\mu(\bigvee_{i=1}^n a_i) = 1$.
In the system of all fuzzy partitions of $(X, M, \mu)$, we define the refinement partial order in the following way. If $A$ and $B$ are two fuzzy partitions of $(X, M, \mu)$, then we say that $B$ is a refinement of $A$ (and write $A \preceq B$) if for every $b \in B$ there exists $a \in A$ such that $b \leq a$. Furthermore, for every two fuzzy partitions $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ of $(X, M, \mu)$, we put $A \vee B = \{a_i \wedge b_j;\ i = 1, 2, \dots, n,\ j = 1, 2, \dots, m\}$. One can easily verify that $A \vee B$ is a family of pairwise W-separated fuzzy sets from $M$; moreover, by the property (P4), we have $\mu(\bigvee_{i=1}^n \bigvee_{j=1}^m (a_i \wedge b_j)) = \mu((\bigvee_{i=1}^n a_i) \wedge (\bigvee_{j=1}^m b_j)) = \mu(\bigvee_{i=1}^n a_i) = 1$. Thus, $A \vee B$ is a fuzzy partition of $(X, M, \mu)$. It represents a combined experiment consisting of a realization of the experiments $A$ and $B$. Evidently, it holds that $A \preceq A \vee B$ and $B \preceq A \vee B$, i.e., the fuzzy partition $A \vee B$ is a common refinement of the fuzzy partitions $A$ and $B$. If $A_1, A_2, \dots, A_n$ are fuzzy partitions of $(X, M, \mu)$, then we put $\bigvee_{i=1}^n A_i = A_1 \vee A_2 \vee \dots \vee A_n$.
Definition 4.
Two fuzzy partitions $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ of a fuzzy probability space $(X, M, \mu)$ are said to be statistically independent if $\mu(a_i \wedge b_j) = \mu(a_i) \cdot \mu(b_j)$, for $i = 1, 2, \dots, n$, $j = 1, 2, \dots, m$.
Example 1.
Let us consider a classical probability space $(X, S, P)$ and put $M = \{\chi_E : E \in S\}$. It can be verified that the map $\mu : M \to [0, 1]$ defined by $\mu(\chi_E) = P(E)$, for every $\chi_E \in M$, is a fuzzy P-measure, and the triplet $(X, M, \mu)$ is a fuzzy probability space. A classical measurable partition $A = \{E_1, E_2, \dots, E_n\}$ of the probability space $(X, S, P)$ can then be regarded as a fuzzy partition of $(X, M, \mu)$, considering $\chi_{E_i}$ instead of $E_i$.
The Shannon entropy of a fuzzy partition of a fuzzy probability space $(X, M, \mu)$ has been introduced and examined in [23]; see also [25].
Definition 5
([23]). We define the entropy of a fuzzy partition $A = \{a_1, a_2, \dots, a_n\}$ of $(X, M, \mu)$ by Shannon's formula:
$$H_\mu(A) = -\sum_{i=1}^n \mu(a_i) \log \mu(a_i). \quad (1)$$
If $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ are two fuzzy partitions of $(X, M, \mu)$, then we define the conditional entropy of $A$ given $B$ by the formula:
$$H_\mu(A/B) = -\sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \log \frac{\mu(a_i \wedge b_j)}{\mu(b_j)} \quad (2)$$
with the convention that $0 \cdot \log\frac{0}{x} = 0$ if $x \geq 0$.
The symbol $\log$ denotes the logarithm to the base 2, so the Shannon entropy of a fuzzy partition is expressed in bits. The entropy and the conditional entropy of fuzzy partitions have properties that correspond to the properties of Shannon's entropy of classical measurable partitions: for all fuzzy partitions $A$, $B$, $C$ of a fuzzy probability space $(X, M, \mu)$, it holds that:
(S1) $A \preceq B$ implies $H_\mu(A) \leq H_\mu(B)$;
(S2) $H_\mu(A \vee B / C) = H_\mu(A/C) + H_\mu(B / C \vee A)$;
(S3) $A \preceq B$ implies $H_\mu(A/C) \leq H_\mu(B/C)$;
(S4) $A \preceq B$ implies $H_\mu(C/A) \geq H_\mu(C/B)$;
(S5) $H_\mu(A/B) \leq H_\mu(A)$, with equality if and only if $A$, $B$ are statistically independent;
(S6) $H_\mu(B \vee C / A) \leq H_\mu(B/A) + H_\mu(C/A)$;
(S7) $H_\mu(A \vee B) = H_\mu(A) + H_\mu(B/A)$.
The proofs can be found in [23,25]. We remark that in [15,16,17,18,19,20,21,22,26,27,28,29,30,31], other conceptions of fuzzy partitions and their entropy measures have been introduced. Whereas our approach is based on Zadeh’s connectives, in the referenced papers Zadeh’s connectives have been replaced by other fuzzy set operations.
We note that in [32], the concept of Kullback–Leibler divergence in a fuzzy probability space was introduced. Let $\mu, \nu$ be two fuzzy P-measures on a fuzzy measurable space $(X, M)$, and let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of the fuzzy probability spaces $(X, M, \mu)$, $(X, M, \nu)$. Then the Kullback–Leibler divergence of the fuzzy P-measures $\mu, \nu$ with respect to $A$ is defined as the number:
$$D_A(\mu \| \nu) = \sum_{i=1}^n \mu(a_i) \log \frac{\mu(a_i)}{\nu(a_i)} \quad (3)$$
with the conventions that $x \cdot \log\frac{x}{0} = \infty$ if $x > 0$, and $0 \cdot \log\frac{0}{x} = 0$ if $x \geq 0$.
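For concreteness, a minimal Python sketch of the Kullback–Leibler divergence above, including both conventions for zero values (the two measures' values on the partition are arbitrary illustrative numbers):

```python
import math

def kl_divergence(mu_vals, nu_vals, base=2):
    """D_A(mu||nu) = sum_i mu_i * log(mu_i / nu_i); conventions:
    x * log(x/0) = infinity for x > 0, and 0 * log(0/x) = 0."""
    total = 0.0
    for m, n in zip(mu_vals, nu_vals):
        if m > 0:
            if n == 0:
                return math.inf
            total += m * math.log(m / n, base)
    return total

mu_vals = [0.5, 0.5]   # mu(a_1), mu(a_2)
nu_vals = [1/3, 2/3]   # nu(a_1), nu(a_2)
print(kl_divergence(mu_vals, nu_vals))
print(kl_divergence(mu_vals, mu_vals))  # 0, by the Gibbs inequality
```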
In the following sections, we will use the well-known Minkowski inequality: for non-negative real numbers $x_1, x_2, \dots, x_n$, $y_1, y_2, \dots, y_n$, it holds that:
$$\Big[\sum_{i=1}^n x_i^R\Big]^{1/R} + \Big[\sum_{i=1}^n y_i^R\Big]^{1/R} \geq \Big[\sum_{i=1}^n (x_i + y_i)^R\Big]^{1/R}, \quad \text{for } R > 1,$$
and
$$\Big[\sum_{i=1}^n x_i^R\Big]^{1/R} + \Big[\sum_{i=1}^n y_i^R\Big]^{1/R} \leq \Big[\sum_{i=1}^n (x_i + y_i)^R\Big]^{1/R}, \quad \text{for } 0 < R < 1.$$
Furthermore, we will use the Jensen inequality, which states that for a real convex function $\varphi$, real numbers $x_1, x_2, \dots, x_m$ in its domain, and non-negative real numbers $\alpha_1, \alpha_2, \dots, \alpha_m$ with $\sum_{j=1}^m \alpha_j = 1$, it holds that:
$$\varphi\Big(\sum_{j=1}^m \alpha_j x_j\Big) \leq \sum_{j=1}^m \alpha_j \varphi(x_j),$$
and the inequality is reversed if $\varphi$ is a real concave function. Equality holds if and only if $x_1 = x_2 = \dots = x_m$ or $\varphi$ is linear.
In addition, we will use L'Hôpital's rule: for functions $f$ and $g$ that are differentiable on an open interval $U$ except possibly at a point $a \in U$, if $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, $g'(x) \neq 0$ for every $x \in U$ with $x \neq a$, and $\lim_{x \to a} \frac{f'(x)}{g'(x)}$ exists, then:
$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.$$

3. The R-Norm Entropy of Fuzzy Partitions

In this part we define the R-norm entropy of a fuzzy partition and its conditional version, and study the properties of these entropy measures. It is shown that, as the limiting cases of the R-norm entropy and the conditional R-norm entropy of fuzzy partitions for $R$ going to 1, we obtain the Shannon entropy $H_\mu(A)$ and the conditional Shannon entropy $H_\mu(A/B)$, respectively, expressed in nats.
Definition 6.
Let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of a fuzzy probability space $(X, M, \mu)$. The R-norm entropy of $A$ with respect to $\mu$ is defined, for a positive real number $R \neq 1$, by the formula:
$$H_R^\mu(A) = \frac{R}{R-1}\left(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right).$$
Remark 1.
For simplicity, we write $\mu(a_i)^R$ instead of $(\mu(a_i))^R$. In the following, we will also write $\log \sum_{i=1}^n \mu(a_i)^R$ instead of $\log(\sum_{i=1}^n \mu(a_i)^R)$.
Theorem 1.
For an arbitrary fuzzy partition $A$ of a fuzzy probability space $(X, M, \mu)$, the R-norm entropy $H_R^\mu(A)$ is non-negative.
Proof. 
Assume that $A = \{a_1, a_2, \dots, a_n\}$. We consider two cases: the case of $R > 1$, and the case of $0 < R < 1$. If $R > 1$, then $\mu(a_i)^R \leq \mu(a_i)$, for $i = 1, 2, \dots, n$; hence $\sum_{i=1}^n \mu(a_i)^R \leq \sum_{i=1}^n \mu(a_i) = 1$. This implies that $[\sum_{i=1}^n \mu(a_i)^R]^{1/R} \leq 1$. Since $\frac{R}{R-1} > 0$ for $R > 1$, it follows that $H_R^\mu(A) = \frac{R}{R-1}(1 - [\sum_{i=1}^n \mu(a_i)^R]^{1/R}) \geq 0$. On the other hand, for $0 < R < 1$, it holds that $\mu(a_i)^R \geq \mu(a_i)$, for $i = 1, 2, \dots, n$; hence $\sum_{i=1}^n \mu(a_i)^R \geq \sum_{i=1}^n \mu(a_i) = 1$. It follows that $[\sum_{i=1}^n \mu(a_i)^R]^{1/R} \geq 1$. Since $\frac{R}{R-1} < 0$ for $0 < R < 1$, we obtain $H_R^\mu(A) = \frac{R}{R-1}(1 - [\sum_{i=1}^n \mu(a_i)^R]^{1/R}) \geq 0$. □
Example 2.
Let $X = [0, 1]$, and let $a : X \to [0, 1]$ be defined by $a(x) = x$, $x \in X$. Consider the fuzzy measurable space $(X, M)$, where $M = \{1_X, 0_X, a, a', a \vee a', a \wedge a'\}$. It can easily be verified that the mappings $\mu : M \to [0, 1]$ and $\nu : M \to [0, 1]$ defined by the equalities $\mu(1_X) = \mu(a \vee a') = 1$, $\mu(0_X) = \mu(a \wedge a') = 0$, $\mu(a) = \mu(a') = \frac{1}{2}$, $\nu(1_X) = \nu(a \vee a') = 1$, $\nu(0_X) = \nu(a \wedge a') = 0$, $\nu(a) = \frac{1}{3}$, $\nu(a') = \frac{2}{3}$, are fuzzy P-measures, and the systems $(X, M, \mu)$, $(X, M, \nu)$ are fuzzy probability spaces. The sets $A = \{a, a'\}$, $B = \{a \vee a'\}$, $C = \{1_X\}$ are fuzzy partitions of $(X, M, \mu)$ and $(X, M, \nu)$ such that $C \preceq B \preceq A$. We can calculate their R-norm entropy. Evidently, $H_R^\mu(B) = H_R^\mu(C) = H_R^\nu(B) = H_R^\nu(C) = 0$; in accordance with the natural requirement, experiments resulting in a certain event have zero R-norm entropy. Furthermore, we have:
$$H_R^\mu(A) = \frac{R}{R-1}\left(1 - \Big[\Big(\frac{1}{2}\Big)^R + \Big(\frac{1}{2}\Big)^R\Big]^{1/R}\right) = \frac{R}{R-1}\left(1 - 2^{\frac{1-R}{R}}\right),$$
$$H_R^\nu(A) = \frac{R}{R-1}\left(1 - \Big[\Big(\frac{1}{3}\Big)^R + \Big(\frac{2}{3}\Big)^R\Big]^{1/R}\right).$$
If we put $R = \frac{1}{2}$, then $H_R^\mu(A) = 1$ and $H_R^\nu(A) \approx 0.9428$; for $R = \frac{1}{3}$, we have $H_R^\mu(A) = 1.5$ and $H_R^\nu(A) \approx 1.4236$; for $R = 2$, we have $H_R^\mu(A) \approx 0.58579$ and $H_R^\nu(A) \approx 0.50928$.
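The numerical values in Example 2 can be verified directly; the following sketch evaluates the defining formula for both measures of the example:

```python
def r_norm_entropy(probs, R):
    """H_R = R/(R-1) * (1 - (sum_i mu(a_i)^R)^(1/R))."""
    norm = sum(p ** R for p in probs) ** (1.0 / R)
    return R / (R - 1.0) * (1.0 - norm)

mu_A = [0.5, 0.5]    # values of the first measure on A = {a, a'}
nu_A = [1/3, 2/3]    # values of the second measure on A
for R in (0.5, 1/3, 2.0):
    print(R, r_norm_entropy(mu_A, R), r_norm_entropy(nu_A, R))
```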
Definition 7.
Let $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ be two fuzzy partitions of a fuzzy probability space $(X, M, \mu)$. Then the conditional R-norm entropy of $A$ given $B$ with respect to $\mu$ is defined, for a positive real number $R \neq 1$, by the formula:
$$H_R^\mu(A/B) = \frac{R}{R-1}\left(\Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \sum_{i=1}^n \mu(a_i \wedge b_j)^R\Big]^{1/R}\right).$$
Remark 2.
Let $A$ be a fuzzy partition of a given fuzzy probability space $(X, M, \mu)$. Evidently, if we put $B = \{b\}$, where $b \in M$ is a W-universum, then $H_R^\mu(A/B) = H_R^\mu(A)$.
The following theorem shows that, in the limit as $R$ goes to 1, the conditional R-norm entropy $H_R^\mu(A/B)$ is consistent with the conditional Shannon entropy $H_\mu(A/B)$ defined by formula (2), up to a positive multiplicative constant.
Theorem 2.
Let $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ be two fuzzy partitions of a given fuzzy probability space $(X, M, \mu)$. Then $\lim_{R \to 1} H_R^\mu(A/B) = c \cdot H_\mu(A/B)$, where $c = \frac{1}{\log e}$, and $H_\mu(A/B) = -\sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \log \frac{\mu(a_i \wedge b_j)}{\mu(b_j)}$.
Proof. 
In the proof, we use L'Hôpital's rule $\lim_{R \to a} \frac{f(R)}{g(R)} = \lim_{R \to a} \frac{f'(R)}{g'(R)}$, where in this case $a = 1$. For every $R \in (0, 1) \cup (1, \infty)$, we can write:
$$H_R^\mu(A/B) = \frac{1}{1 - \frac{1}{R}}\left(\Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \sum_{i=1}^n \mu(a_i \wedge b_j)^R\Big]^{1/R}\right) = \frac{f(R)}{g(R)},$$
where $f, g$ are continuous functions defined for $R \in (0, \infty)$ in the following way:
$$f(R) = \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \sum_{i=1}^n \mu(a_i \wedge b_j)^R\Big]^{1/R}, \quad g(R) = 1 - \frac{1}{R}.$$
By continuity of the function $g$, we get $\lim_{R \to 1} g(R) = g(1) = 0$. Furthermore, by continuity of the function $f$ and by the property (P4) of the fuzzy P-measure $\mu$, we get $\lim_{R \to 1} f(R) = f(1) = \sum_{j=1}^m \mu(b_j) - \sum_{j=1}^m \sum_{i=1}^n \mu(a_i \wedge b_j) = 1 - \sum_{j=1}^m \mu((\bigvee_{i=1}^n a_i) \wedge b_j) = 1 - \sum_{j=1}^m \mu(b_j) = 1 - 1 = 0$. Using L'Hôpital's rule, this implies:
$$\lim_{R \to 1} H_R^\mu(A/B) = \frac{\lim_{R \to 1} f'(R)}{\lim_{R \to 1} g'(R)}$$
under the assumption that the right-hand side exists. To find the derivative of the function $f(R)$ we use the identity $b^\alpha = e^{\alpha \ln b}$. Let us calculate:
$$\frac{d}{dR} f(R) = e^{\frac{1}{R} \ln \sum_{j=1}^m \mu(b_j)^R} \left(-\frac{1}{R^2} \ln \sum_{j=1}^m \mu(b_j)^R + \frac{1}{R} \cdot \frac{\sum_{j=1}^m \mu(b_j)^R \ln \mu(b_j)}{\sum_{j=1}^m \mu(b_j)^R}\right) - e^{\frac{1}{R} \ln \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j)^R} \left(-\frac{1}{R^2} \ln \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j)^R + \frac{1}{R} \cdot \frac{\sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j)^R \ln \mu(a_i \wedge b_j)}{\sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j)^R}\right).$$
Since $\lim_{R \to 1} g'(R) = \lim_{R \to 1} \frac{1}{R^2} = 1$, we obtain:
$$\lim_{R \to 1} H_R^\mu(A/B) = \lim_{R \to 1} f'(R) = \sum_{j=1}^m \mu(b_j) \ln \mu(b_j) - \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \ln \mu(a_i \wedge b_j) = \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \ln \mu(b_j) - \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \ln \mu(a_i \wedge b_j) = -\sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \left(\frac{\log \mu(a_i \wedge b_j)}{\log e} - \frac{\log \mu(b_j)}{\log e}\right) = -\frac{1}{\log e} \sum_{i=1}^n \sum_{j=1}^m \mu(a_i \wedge b_j) \log \frac{\mu(a_i \wedge b_j)}{\mu(b_j)} = \frac{1}{\log e} H_\mu(A/B). \quad \square$$
Theorem 3.
Let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of a fuzzy probability space $(X, M, \mu)$. Then $\lim_{R \to 1} H_R^\mu(A) = c \cdot H_\mu(A)$, where $H_\mu(A) = -\sum_{i=1}^n \mu(a_i) \log \mu(a_i)$, and $c = \frac{1}{\log e}$.
Proof. 
The claim is a direct consequence of the previous theorem; it suffices to put $B = \{1_X\}$. □
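Theorems 2 and 3 can also be checked numerically. The sketch below uses a hypothetical joint table $\mu(a_i \wedge b_j)$ (illustrative values, not from the paper) and compares the conditional R-norm entropy near $R = 1$ with the conditional Shannon entropy in nats:

```python
import math

def cond_r_norm_entropy(joint, R):
    """H_R(A/B) = R/(R-1) * ([sum_j mu(b_j)^R]^(1/R)
                             - [sum_{i,j} mu(a_i ^ b_j)^R]^(1/R))."""
    m = len(joint[0])
    marg_b = [sum(row[j] for row in joint) for j in range(m)]
    norm_b = sum(p ** R for p in marg_b) ** (1.0 / R)
    norm_ab = sum(p ** R for row in joint for p in row) ** (1.0 / R)
    return R / (R - 1.0) * (norm_b - norm_ab)

def cond_shannon_nats(joint):
    m = len(joint[0])
    marg_b = [sum(row[j] for row in joint) for j in range(m)]
    return -sum(row[j] * math.log(row[j] / marg_b[j])
                for row in joint for j in range(m) if row[j] > 0)

joint = [[0.2, 0.1],   # mu(a_1 ^ b_1), mu(a_1 ^ b_2)
         [0.3, 0.4]]   # mu(a_2 ^ b_1), mu(a_2 ^ b_2)
# Near R = 1 the two quantities agree (Theorem 2, in nats).
print(cond_r_norm_entropy(joint, 1.000001), cond_shannon_nats(joint))
```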
In the following, the properties of the R-norm entropy of fuzzy partitions are discussed.
Theorem 4.
For arbitrary fuzzy partitions $A$, $B$, and $C$ of a fuzzy probability space $(X, M, \mu)$, it holds that:
$$H_R^\mu(A \vee B / C) = H_R^\mu(A/C) + H_R^\mu(B / A \vee C).$$
Proof. 
Suppose that $A = \{a_1, a_2, \dots, a_n\}$, $B = \{b_1, b_2, \dots, b_m\}$, $C = \{c_1, c_2, \dots, c_r\}$. Let us calculate:
$$H_R^\mu(A \vee B / C) = \frac{R}{R-1}\left(\Big[\sum_{k=1}^r \mu(c_k)^R\Big]^{1/R} - \Big[\sum_{i=1}^n \sum_{j=1}^m \sum_{k=1}^r \mu(a_i \wedge b_j \wedge c_k)^R\Big]^{1/R}\right) = \frac{R}{R-1}\left(\Big[\sum_{k=1}^r \mu(c_k)^R\Big]^{1/R} - \Big[\sum_{i=1}^n \sum_{k=1}^r \mu(a_i \wedge c_k)^R\Big]^{1/R}\right) + \frac{R}{R-1}\left(\Big[\sum_{i=1}^n \sum_{k=1}^r \mu(a_i \wedge c_k)^R\Big]^{1/R} - \Big[\sum_{i=1}^n \sum_{j=1}^m \sum_{k=1}^r \mu(a_i \wedge b_j \wedge c_k)^R\Big]^{1/R}\right) = H_R^\mu(A/C) + H_R^\mu(B / A \vee C). \quad \square$$
Theorem 5.
For arbitrary fuzzy partitions $A$, $B$ of a fuzzy probability space $(X, M, \mu)$, it holds that:
$$H_R^\mu(A \vee B) = H_R^\mu(A) + H_R^\mu(B/A).$$
Proof. 
The claim is a direct consequence of the previous theorem; it suffices to put $C = \{1_X\}$. □
In the following theorem, using the notion of conditional R-norm entropy of fuzzy partitions, chain rules for the R-norm entropy of fuzzy partitions are established.
Theorem 6.
Let $A_1, A_2, \dots, A_n$ and $C$ be fuzzy partitions of a fuzzy probability space $(X, M, \mu)$. Then, for $n = 2, 3, \dots$, the following equalities hold:
(i) $H_R^\mu(A_1 \vee A_2 \vee \dots \vee A_n) = H_R^\mu(A_1) + \sum_{i=2}^n H_R^\mu\big(A_i / \bigvee_{k=1}^{i-1} A_k\big)$;
(ii) $H_R^\mu\big(\bigvee_{i=1}^n A_i / C\big) = H_R^\mu(A_1/C) + \sum_{i=2}^n H_R^\mu\big(A_i / \big(\bigvee_{k=1}^{i-1} A_k\big) \vee C\big)$.
Proof. 
The proof can be done using mathematical induction and Theorems 4 and 5. □
In the following we prove that the R-norm entropy H R μ ( A ) is a concave function on the class of all fuzzy P-measures on a given fuzzy measurable space ( X , M ) .
Proposition 1.
Let $\mu, \nu$ be two fuzzy P-measures on a given fuzzy measurable space $(X, M)$. Then, for every real number $\lambda \in [0, 1]$, the map $\lambda\mu + (1 - \lambda)\nu : M \to [0, 1]$ is a fuzzy P-measure on $(X, M)$.
Proof. 
It is straightforward. □
Theorem 7.
Let $A$ be a fuzzy partition of fuzzy probability spaces $(X, M, \mu)$, $(X, M, \nu)$. Then, for every real number $\lambda \in [0, 1]$, the following inequality holds:
$$\lambda H_R^\mu(A) + (1 - \lambda) H_R^\nu(A) \leq H_R^{\lambda\mu + (1-\lambda)\nu}(A).$$
Proof. 
Let $A = \{a_1, a_2, \dots, a_n\}$ and $\lambda \in [0, 1]$. Putting $x_i = \lambda \mu(a_i)$ and $y_i = (1 - \lambda)\nu(a_i)$, for $i = 1, 2, \dots, n$, in the Minkowski inequality, we obtain for $R > 1$:
$$\lambda \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} + (1 - \lambda) \Big[\sum_{i=1}^n \nu(a_i)^R\Big]^{1/R} \geq \Big[\sum_{i=1}^n \big(\lambda \mu(a_i) + (1 - \lambda)\nu(a_i)\big)^R\Big]^{1/R},$$
and for $0 < R < 1$:
$$\lambda \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} + (1 - \lambda) \Big[\sum_{i=1}^n \nu(a_i)^R\Big]^{1/R} \leq \Big[\sum_{i=1}^n \big(\lambda \mu(a_i) + (1 - \lambda)\nu(a_i)\big)^R\Big]^{1/R}.$$
This means that the function $\mu \mapsto [\sum_{i=1}^n \mu(a_i)^R]^{1/R}$ is convex in $\mu$ for $R > 1$ and concave in $\mu$ for $0 < R < 1$. Therefore, the function $\mu \mapsto 1 - [\sum_{i=1}^n \mu(a_i)^R]^{1/R}$ is concave in $\mu$ for $R > 1$ and convex in $\mu$ for $0 < R < 1$. Evidently, $\frac{R}{R-1} > 0$ for $R > 1$, and $\frac{R}{R-1} < 0$ for $0 < R < 1$. According to the definition of the R-norm entropy, we obtain that for every $R \in (0, 1) \cup (1, \infty)$, the map $\mu \mapsto H_R^\mu(A)$ is a concave function on the family of all fuzzy P-measures on a given fuzzy measurable space $(X, M)$. Thus, for every $\lambda \in [0, 1]$, it holds that:
$$\lambda H_R^\mu(A) + (1 - \lambda) H_R^\nu(A) \leq H_R^{\lambda\mu + (1-\lambda)\nu}(A). \quad \square$$
Proposition 2.
Let $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ be fuzzy partitions of a fuzzy probability space $(X, M, \mu)$ such that $A \preceq B$. Then there exists a partition $\{I_1, I_2, \dots, I_n\}$ of the set $\{1, 2, \dots, m\}$ such that $\mu(a_i) = \sum_{j \in I_i} \mu(b_j)$, for $i = 1, 2, \dots, n$.
Proof. 
By the assumption, for every $b \in B$ there exists $a \in A$ such that $b \leq a$. Let us denote by $I_i$ the subset of $\{1, 2, \dots, m\}$ such that for every $j \in I_i$, it holds that $b_j \leq a_i$, $i = 1, 2, \dots, n$. Then the set $\{I_1, I_2, \dots, I_n\}$ is a partition of $\{1, 2, \dots, m\}$ and $\bigvee_{j \in I_i} b_j \leq a_i$, for $i = 1, 2, \dots, n$. By monotonicity of the fuzzy P-measure $\mu$, $\mu(a_i) \geq \mu(\bigvee_{j \in I_i} b_j) = \sum_{j \in I_i} \mu(b_j)$, for $i = 1, 2, \dots, n$. Summing over $i = 1, 2, \dots, n$, we get $\sum_{i=1}^n \mu(a_i) \geq \sum_{i=1}^n \sum_{j \in I_i} \mu(b_j) = \sum_{j=1}^m \mu(b_j) = 1$. Since $\sum_{i=1}^n \mu(a_i) = 1$, it follows that $\mu(a_i) = \sum_{j \in I_i} \mu(b_j)$, for $i = 1, 2, \dots, n$. □
Theorem 8.
Let $A$, $B$, $C$ be fuzzy partitions of a fuzzy probability space $(X, M, \mu)$ such that $A \preceq B$. Then:
(i) $H_R^\mu(A) \leq H_R^\mu(B)$;
(ii) $H_R^\mu(A/C) \leq H_R^\mu(B/C)$.
Proof. 
(i) Assume that $A = \{a_1, a_2, \dots, a_n\}$, $B = \{b_1, b_2, \dots, b_m\}$, $A \preceq B$. According to Proposition 2 there exists a partition $\{I_1, I_2, \dots, I_n\}$ of the set $\{1, 2, \dots, m\}$ such that $\mu(a_i) = \sum_{j \in I_i} \mu(b_j)$, for $i = 1, 2, \dots, n$. For the case of $R > 1$, we obtain:
$$\mu(a_i)^R = \Big(\sum_{j \in I_i} \mu(b_j)\Big)^R \geq \sum_{j \in I_i} \mu(b_j)^R, \quad \text{for } i = 1, 2, \dots, n,$$
and consequently:
$$\sum_{i=1}^n \mu(a_i)^R \geq \sum_{j=1}^m \mu(b_j)^R.$$
Hence
$$\Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} \geq \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R}.$$
Since $\frac{R}{R-1} > 0$ for $R > 1$, we conclude that:
$$H_R^\mu(A) = \frac{R}{R-1}\left(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right) \leq \frac{R}{R-1}\left(1 - \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R}\right) = H_R^\mu(B).$$
For the case of $0 < R < 1$, we get:
$$\mu(a_i)^R = \Big(\sum_{j \in I_i} \mu(b_j)\Big)^R \leq \sum_{j \in I_i} \mu(b_j)^R, \quad \text{for } i = 1, 2, \dots, n,$$
and consequently:
$$\sum_{i=1}^n \mu(a_i)^R \leq \sum_{j=1}^m \mu(b_j)^R.$$
Therefore:
$$\Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} \leq \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R}.$$
Since $\frac{R}{R-1} < 0$ for $0 < R < 1$, we have:
$$H_R^\mu(A) = \frac{R}{R-1}\left(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right) \leq \frac{R}{R-1}\left(1 - \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R}\right) = H_R^\mu(B).$$
(ii) By the assumption, for every $b \in B$ there exists $a \in A$ such that $b \leq a$. Hence, for an arbitrary element $b \wedge c$ of the fuzzy partition $B \vee C$ there exists $a \wedge c \in A \vee C$ such that $b \wedge c \leq a \wedge c$. This means that $A \vee C \preceq B \vee C$. Therefore, using Theorem 5 and part (i), we get:
$$H_R^\mu(A/C) = H_R^\mu(A \vee C) - H_R^\mu(C) \leq H_R^\mu(B \vee C) - H_R^\mu(C) = H_R^\mu(B/C). \quad \square$$
Theorem 9.
Let $A$, $B$ be statistically independent fuzzy partitions of a fuzzy probability space $(X, M, \mu)$. Then:
$$H_R^\mu(A/B) = H_R^\mu(A) - \frac{R-1}{R} H_R^\mu(A) H_R^\mu(B).$$
Proof. 
Let $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$. By the assumption, $\mu(a_i \wedge b_j) = \mu(a_i) \cdot \mu(b_j)$, for $i = 1, 2, \dots, n$, $j = 1, 2, \dots, m$. Therefore we can write:
$$H_R^\mu(A/B) = \frac{R}{R-1}\left(\Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \sum_{i=1}^n \mu(a_i \wedge b_j)^R\Big]^{1/R}\right) = \frac{R}{R-1}\left(\Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right) = \frac{R}{R-1}\left(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} - 1 + \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R} + \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} - \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R} \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right) = \frac{R}{R-1}\left(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\right) - \frac{R-1}{R}\left(\frac{R}{R-1}\Big(1 - \Big[\sum_{i=1}^n \mu(a_i)^R\Big]^{1/R}\Big)\right)\left(\frac{R}{R-1}\Big(1 - \Big[\sum_{j=1}^m \mu(b_j)^R\Big]^{1/R}\Big)\right) = H_R^\mu(A) - \frac{R-1}{R} H_R^\mu(A) H_R^\mu(B). \quad \square$$
In view of Theorems 5 and 9, the R-norm entropy does not have the property of additivity, but it satisfies the property that is called pseudo-additivity, as stated in the following theorem.
Theorem 10.
(Pseudo-additivity). Let $A$, $B$ be statistically independent fuzzy partitions of a fuzzy probability space $(X, M, \mu)$. Then:
$$H_R^\mu(A \vee B) = H_R^\mu(A) + H_R^\mu(B) - \frac{R-1}{R} H_R^\mu(A) H_R^\mu(B).$$
Proof. 
The result follows by combining Theorems 5 and 9. □
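Pseudo-additivity is easy to verify numerically for statistically independent partitions, where the joint values factor as $\mu(a_i \wedge b_j) = \mu(a_i)\mu(b_j)$. The sketch below uses illustrative values:

```python
def r_norm_entropy(probs, R):
    """H_R = R/(R-1) * (1 - (sum_i p_i^R)^(1/R))."""
    norm = sum(p ** R for p in probs) ** (1.0 / R)
    return R / (R - 1.0) * (1.0 - norm)

R = 2.0
a = [0.5, 0.5]
b = [1/3, 2/3]
joint = [pa * pb for pa in a for pb in b]  # statistical independence
lhs = r_norm_entropy(joint, R)
rhs = (r_norm_entropy(a, R) + r_norm_entropy(b, R)
       - (R - 1.0) / R * r_norm_entropy(a, R) * r_norm_entropy(b, R))
print(lhs, rhs)  # the two sides coincide
```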

4. The R-Norm Divergence of Fuzzy P-Measures

In this part, the concept of the R-norm divergence of fuzzy P-measures is defined. In order to avoid expressions of the form $\frac{0}{0}$, we adopt in this section the following simplification: for any fuzzy partition $A = \{a_1, a_2, \dots, a_n\}$ of a fuzzy probability space $(X, M, \mu)$, we assume that $\mu(a_i) > 0$, for $i = 1, 2, \dots, n$. Note that this is without loss of generality, because $\sum_{i=1}^n \mu(a_i) = \sum_{i : \mu(a_i) > 0} \mu(a_i)$. We prove the basic properties of this quantity and illustrate the results with numerical examples.
Definition 8.
Let $\mu, \nu$ be two fuzzy P-measures on a fuzzy measurable space $(X, M)$, and let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of the fuzzy probability spaces $(X, M, \mu)$, $(X, M, \nu)$. The R-norm divergence of the fuzzy P-measures $\mu, \nu$ with respect to $A$ is defined, for a positive real number $R \neq 1$, as the number:
$$D_R^A(\mu \| \nu) = \frac{R}{R-1}\left(\Big[\sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R}\Big]^{1/R} - 1\right).$$
Remark 3.
It is easy to see that, for any fuzzy partition $A$ of a fuzzy probability space $(X, M, \mu)$, we have $D_R^A(\mu \| \mu) = 0$.
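A direct transcription of Definition 8 into Python (the measure values on the partition are arbitrary illustrative numbers); the observation of Remark 3 that $D_R^A(\mu \| \mu) = 0$ is visible immediately:

```python
def r_norm_divergence(mu_vals, nu_vals, R):
    """D_R(mu||nu) = R/(R-1) * ((sum_i mu_i^R * nu_i^(1-R))^(1/R) - 1)."""
    s = sum(m ** R * n ** (1.0 - R)
            for m, n in zip(mu_vals, nu_vals)) ** (1.0 / R)
    return R / (R - 1.0) * (s - 1.0)

mu_vals = [0.5, 0.5]
nu_vals = [1/3, 2/3]
print(r_norm_divergence(mu_vals, nu_vals, 2.0))
print(r_norm_divergence(mu_vals, mu_vals, 2.0))  # 0, as in Remark 3
```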
The following theorem states that, in the limit as $R$ goes to 1, the R-norm divergence $D_R^A(\mu \| \nu)$ is consistent with the Kullback–Leibler divergence $D_A(\mu \| \nu)$ defined by formula (3), up to a positive multiplicative constant.
Theorem 11.
Let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of fuzzy probability spaces $(X, M, \mu)$, $(X, M, \nu)$. Then $\lim_{R \to 1} D_R^A(\mu \| \nu) = c \cdot D_A(\mu \| \nu)$, where $D_A(\mu \| \nu) = \sum_{i=1}^n \mu(a_i) \log \frac{\mu(a_i)}{\nu(a_i)}$, and $c = \frac{1}{\log e}$.
Proof. 
For every $R \in (0, 1) \cup (1, \infty)$, we can write:
$$D_R^A(\mu \| \nu) = \frac{1}{1 - \frac{1}{R}}\left(\Big[\sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R}\Big]^{1/R} - 1\right) = \frac{f(R)}{g(R)},$$
where $f, g$ are continuous functions defined for $R \in (0, \infty)$ by the formulas:
$$f(R) = \Big[\sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R}\Big]^{1/R} - 1, \quad g(R) = 1 - \frac{1}{R}.$$
By continuity of the functions $f, g$, we get $\lim_{R \to 1} f(R) = f(1) = \sum_{i=1}^n \mu(a_i) \nu(a_i)^0 - 1 = \sum_{i=1}^n \mu(a_i) - 1 = 1 - 1 = 0$, and $\lim_{R \to 1} g(R) = g(1) = 0$. Using L'Hôpital's rule, this implies that:
$$\lim_{R \to 1} D_R^A(\mu \| \nu) = \frac{\lim_{R \to 1} f'(R)}{\lim_{R \to 1} g'(R)}$$
under the assumption that the right-hand side exists. Let us calculate the derivative of the function $f(R)$:
$$\frac{d}{dR} f(R) = e^{\frac{1}{R} \ln \sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R}} \left(-\frac{1}{R^2} \ln \sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R} + \frac{1}{R} \cdot \frac{\sum_{i=1}^n \big(\mu(a_i)^R \nu(a_i)^{1-R} \ln \mu(a_i) - \mu(a_i)^R \nu(a_i)^{1-R} \ln \nu(a_i)\big)}{\sum_{i=1}^n \mu(a_i)^R \nu(a_i)^{1-R}}\right).$$
Since $\lim_{R \to 1} g'(R) = \lim_{R \to 1} \frac{1}{R^2} = 1$, we get:
$$\lim_{R \to 1} D_R^A(\mu \| \nu) = \lim_{R \to 1} f'(R) = \sum_{i=1}^n \big(\mu(a_i) \ln \mu(a_i) - \mu(a_i) \ln \nu(a_i)\big) = \sum_{i=1}^n \left(\mu(a_i) \frac{\log \mu(a_i)}{\log e} - \mu(a_i) \frac{\log \nu(a_i)}{\log e}\right) = \frac{1}{\log e} \sum_{i=1}^n \mu(a_i) \log \frac{\mu(a_i)}{\nu(a_i)} = \frac{1}{\log e} D_A(\mu \| \nu). \quad \square$$
Remark 4.
Evidently, if the Kullback–Leibler divergence $D_A(\mu \| \nu)$ is expressed in terms of the natural logarithm, then it is the limiting case of the R-norm divergence $D_R^A(\mu \| \nu)$ for $R$ going to 1.
Let $A = \{a_1, a_2, \dots, a_n\}$ be a fuzzy partition of fuzzy probability spaces $(X, M, \mu)$, $(X, M, \nu)$. In [32], it has been shown that the Kullback–Leibler divergence satisfies the Gibbs inequality $D_A(\mu \| \nu) \geq 0$, with equality if and only if $\mu(a_i) = \nu(a_i)$, for $i = 1, 2, \dots, n$. This result allows us to interpret $D_A(\mu \| \nu)$ as a distance measure between two fuzzy P-measures (over the same fuzzy partition). In the following theorem, we present an analogue of this result for the R-norm divergence.
Theorem 12.
Let $A=\{a_1,a_2,\dots,a_n\}$ be a fuzzy partition of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. Then $D_R^A(\mu \parallel \upsilon)\ge 0$, with equality if and only if $\mu(a_i)=\upsilon(a_i)$, for $i=1,2,\dots,n$.
Proof. 
We shall consider two cases: the case of R > 1 , and the case of 0 < R < 1 .
Consider the case of $R>1$. The inequality follows from Jensen's inequality applied to the function $\varphi$ defined by $\varphi(x)=x^{1-R}$, for every $x\in[0,\infty)$, with the weights $\alpha_i=\mu(a_i)$ and the points $x_i=\frac{\upsilon(a_i)}{\mu(a_i)}$, for $i=1,2,\dots,n$. The assumption $R>1$ implies $1-R<0$; hence the function $\varphi$ is convex. Therefore, by Jensen's inequality we obtain:
$$1=\left(\sum_{i=1}^{n}\upsilon(a_i)\right)^{1-R}=\left(\sum_{i=1}^{n}\mu(a_i)\frac{\upsilon(a_i)}{\mu(a_i)}\right)^{1-R}\le\sum_{i=1}^{n}\mu(a_i)\left(\frac{\upsilon(a_i)}{\mu(a_i)}\right)^{1-R}=\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R},$$
and consequently:
$$\left[\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\ge 1.$$
Since $\frac{R}{R-1}>0$ for $R>1$, it follows that:
$$D_R^A(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1\right)\ge 0.$$
For $0<R<1$, the function $\varphi$ defined by $\varphi(x)=x^{1-R}$, for every $x\in[0,\infty)$, is concave. Hence, using the Jensen inequality, we obtain:
$$1=\left(\sum_{i=1}^{n}\upsilon(a_i)\right)^{1-R}=\left(\sum_{i=1}^{n}\mu(a_i)\frac{\upsilon(a_i)}{\mu(a_i)}\right)^{1-R}\ge\sum_{i=1}^{n}\mu(a_i)\left(\frac{\upsilon(a_i)}{\mu(a_i)}\right)^{1-R}=\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R},$$
and consequently:
$$\left[\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\le 1.$$
Since $\frac{R}{R-1}<0$ for $0<R<1$, we conclude that:
$$D_R^A(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^{R}\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1\right)\ge 0.$$
The equality in (9) holds if and only if $\frac{\upsilon(a_i)}{\mu(a_i)}$ is constant, for $i=1,2,\dots,n$, i.e., if and only if $\upsilon(a_i)=c\,\mu(a_i)$, for $i=1,2,\dots,n$. Taking the sum over all $i=1,2,\dots,n$, we get $\sum_{i=1}^{n}\upsilon(a_i)=c\sum_{i=1}^{n}\mu(a_i)$, which implies that $c=1$. Therefore, $\upsilon(a_i)=\mu(a_i)$, for $i=1,2,\dots,n$. This means that $D_R^A(\mu \parallel \upsilon)=0$ if and only if $\mu(a_i)=\upsilon(a_i)$, for $i=1,2,\dots,n$. □
In the example that follows, it is shown that the equality $D_R^A(\mu \parallel \upsilon)=D_R^A(\upsilon \parallel \mu)$ does not hold in general, which means that the R-norm divergence $D_R^A(\mu \parallel \upsilon)$ is not symmetric. Therefore, it is not a metric in the true sense.
Example 3.
Consider the fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$ defined in Example 2 and the fuzzy partition $A=\{a,a'\}$ of $(X,M,\mu)$, $(X,M,\upsilon)$. Let us calculate the R-norm divergences $D_R^A(\mu \parallel \upsilon)$ and $D_R^A(\upsilon \parallel \mu)$:
$$D_R^A(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\left(\tfrac{1}{2}\right)^R\left(\tfrac{1}{3}\right)^{1-R}+\left(\tfrac{1}{2}\right)^R\left(\tfrac{2}{3}\right)^{1-R}\right]^{\frac{1}{R}}-1\right),\qquad D_R^A(\upsilon \parallel \mu)=\frac{R}{R-1}\left(\left[\left(\tfrac{1}{3}\right)^R\left(\tfrac{1}{2}\right)^{1-R}+\left(\tfrac{2}{3}\right)^R\left(\tfrac{1}{2}\right)^{1-R}\right]^{\frac{1}{R}}-1\right).$$
Put $R=2$. Elementary calculations show that $D_2^A(\mu \parallel \upsilon)\doteq 0.1213$ and $D_2^A(\upsilon \parallel \mu)\doteq 0.1082$; thus $D_2^A(\mu \parallel \upsilon)\ne D_2^A(\upsilon \parallel \mu)$. For $R=\frac{1}{3}$, we have $D_{1/3}^A(\mu \parallel \upsilon)\doteq 0.0188$ and $D_{1/3}^A(\upsilon \parallel \mu)\doteq 0.01908$, i.e., $D_{1/3}^A(\mu \parallel \upsilon)\ne D_{1/3}^A(\upsilon \parallel \mu)$. This means that $D_R^A(\mu \parallel \upsilon)\ne D_R^A(\upsilon \parallel \mu)$, in general.
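The four figures in Example 3 can be reproduced directly. A minimal sketch in Python, using the value vectors $\mu = (1/2,\,1/2)$ and $\upsilon = (1/3,\,2/3)$ that appear in the formulas above:

```python
def d_r(mu, nu, R):
    """R-norm divergence of two value vectors (mu(a_i)), (nu(a_i)), for R > 0, R != 1."""
    s = sum(m**R * n**(1 - R) for m, n in zip(mu, nu))
    return R / (R - 1) * (s**(1 / R) - 1)

mu = [0.5, 0.5]   # mu over the two elements of A
nu = [1/3, 2/3]   # upsilon over the two elements of A
print(round(d_r(mu, nu, 2), 4), round(d_r(nu, mu, 2), 4))        # 0.1213 0.1082
print(round(d_r(mu, nu, 1/3), 4), round(d_r(nu, mu, 1/3), 5))    # 0.0188 0.01908
```

The two printed pairs differ in both directions, confirming the asymmetry of the R-norm divergence.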
Theorem 13.
Let $\mu,\upsilon$ be two fuzzy P-measures on a fuzzy measurable space $(X,M)$, and let $A=\{a_1,a_2,\dots,a_n\}$ be a fuzzy partition of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. In addition, let $\upsilon$ be uniform over $A$, i.e., $\upsilon(a_i)=\frac{1}{n}$, for $i=1,2,\dots,n$. Then, it holds that:
$$H_R^{\mu}(A)=\frac{R}{R-1}\left(1-n^{\frac{1-R}{R}}\right)-n^{\frac{1-R}{R}}\,D_R^A(\mu \parallel \upsilon).$$
Proof. 
Let us calculate:
$$D_R^A(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^R\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1\right)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^R\,n^{R-1}\right]^{\frac{1}{R}}-1\right)=\frac{R}{R-1}\,n^{\frac{R-1}{R}}\left[\sum_{i=1}^{n}\mu(a_i)^R\right]^{\frac{1}{R}}-\frac{R}{R-1}=-n^{\frac{R-1}{R}}\,\frac{R}{R-1}\left(1-\left[\sum_{i=1}^{n}\mu(a_i)^R\right]^{\frac{1}{R}}\right)+n^{\frac{R-1}{R}}\,\frac{R}{R-1}-\frac{R}{R-1}=-n^{\frac{R-1}{R}}\,H_R^{\mu}(A)+\frac{R}{R-1}\left(n^{\frac{R-1}{R}}-1\right).$$
It follows that:
$$H_R^{\mu}(A)=\frac{R}{R-1}\left(1-n^{\frac{1-R}{R}}\right)-n^{\frac{1-R}{R}}\,D_R^A(\mu \parallel \upsilon). \;\square$$
Example 4.
Consider the fuzzy P-measures $\mu,\upsilon$ from Example 2 and the fuzzy partition $A=\{a,a'\}$ of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. The fuzzy P-measure $\mu$ is uniform over $A$. Put $R=2$. Based on previous results, we have $D_2^A(\upsilon \parallel \mu)\doteq 0.1082$ and $H_2^{\upsilon}(A)\doteq 0.50928$. Let us calculate:
$$\frac{R}{R-1}\left(1-2^{\frac{1-R}{R}}\right)-2^{\frac{1-R}{R}}\,D_R^A(\upsilon \parallel \mu)\doteq 2\left(1-2^{-\frac{1}{2}}\right)-2^{-\frac{1}{2}}\cdot 0.1082\doteq 0.50928.$$
Thus, Equality (10) holds.
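The identity of Theorem 13 can also be checked in floating point. The sketch below (Python) uses the R-norm entropy in the form $\frac{R}{R-1}\big(1-[\sum_i \mu(a_i)^R]^{1/R}\big)$ that the proof of Theorem 13 works with, and the roles of the measures follow Example 4, where $\mu$ is uniform:

```python
def h_r(p, R):
    """R-norm entropy of a value vector (p_i), for R > 0, R != 1."""
    return R / (R - 1) * (1 - sum(x**R for x in p)**(1 / R))

def d_r(mu, nu, R):
    """R-norm divergence of two value vectors."""
    s = sum(m**R * n**(1 - R) for m, n in zip(mu, nu))
    return R / (R - 1) * (s**(1 / R) - 1)

n, R = 2, 2
nu = [1/3, 2/3]    # the non-uniform fuzzy P-measure from Example 2
mu = [1/n] * n     # mu is uniform over A
lhs = h_r(nu, R)
rhs = R/(R-1) * (1 - n**((1-R)/R)) - n**((1-R)/R) * d_r(nu, mu, R)
print(lhs, rhs)    # the two sides agree, ≈ 0.5093
```

The two printed values coincide up to rounding, which is Equality (10) with the roles of $\mu$ and $\upsilon$ as in Example 4.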
As a direct consequence of Theorems 12 and 13, we obtain the following property of the R-norm entropy of fuzzy partitions:
Corollary 1.
For an arbitrary fuzzy partition $A=\{a_1,a_2,\dots,a_n\}$ of a fuzzy probability space $(X,M,\mu)$, it holds that:
$$H_R^{\mu}(A)\le\frac{R}{R-1}\left(1-n^{\frac{1-R}{R}}\right),$$
with equality if and only if the fuzzy P-measure $\mu$ is uniform over $A$.
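Corollary 1 is a maximum-entropy bound. A small sketch (Python; the non-uniform vector is a hypothetical choice for illustration) checks it for $n=3$, $R=2$:

```python
def h_r(p, R):
    """R-norm entropy of a value vector (p_i), for R > 0, R != 1."""
    return R / (R - 1) * (1 - sum(x**R for x in p)**(1 / R))

n, R = 3, 2
bound = R / (R - 1) * (1 - n**((1 - R) / R))   # the right-hand side of Corollary 1
print(h_r([1/3, 1/3, 1/3], R), bound)          # equal for the uniform fuzzy P-measure
print(h_r([0.6, 0.3, 0.1], R) < bound)         # True: a non-uniform measure stays below
```

The bound is attained exactly at the uniform measure, as the corollary states.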
Theorem 14.
Let $\mu_1,\mu_2,\upsilon$ be fuzzy P-measures on a fuzzy measurable space $(X,M)$, and let $A$ be a fuzzy partition of fuzzy probability spaces $(X,M,\mu_1)$, $(X,M,\mu_2)$, $(X,M,\upsilon)$. Then, for every real number $\lambda\in[0,1]$, it holds that:
$$D_R^A(\lambda\mu_1+(1-\lambda)\mu_2 \parallel \upsilon)\le\lambda\,D_R^A(\mu_1 \parallel \upsilon)+(1-\lambda)\,D_R^A(\mu_2 \parallel \upsilon).$$
Proof. 
Assume that $A=\{a_1,a_2,\dots,a_n\}$ and $\lambda\in[0,1]$. Putting $x_i=\lambda\,\mu_1(a_i)\,\upsilon(a_i)^{\frac{1-R}{R}}$ and $y_i=(1-\lambda)\,\mu_2(a_i)\,\upsilon(a_i)^{\frac{1-R}{R}}$, $i=1,2,\dots,n$, in the Minkowski inequality, we get for $R>1$:
$$\left[\sum_{i=1}^{n}\left(\lambda\mu_1(a_i)+(1-\lambda)\mu_2(a_i)\right)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}=\left[\sum_{i=1}^{n}\left(\lambda\mu_1(a_i)\upsilon(a_i)^{\frac{1-R}{R}}+(1-\lambda)\mu_2(a_i)\upsilon(a_i)^{\frac{1-R}{R}}\right)^R\right]^{\frac{1}{R}}\le\lambda\left[\sum_{i=1}^{n}\mu_1(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}+(1-\lambda)\left[\sum_{i=1}^{n}\mu_2(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}},$$
and for $0<R<1$:
$$\left[\sum_{i=1}^{n}\left(\lambda\mu_1(a_i)+(1-\lambda)\mu_2(a_i)\right)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\ge\lambda\left[\sum_{i=1}^{n}\mu_1(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}+(1-\lambda)\left[\sum_{i=1}^{n}\mu_2(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}.$$
This means that the function $\mu\mapsto\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}$ is convex in $\mu$ for $R>1$ and concave in $\mu$ for $0<R<1$. The same applies to the function $\mu\mapsto\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1$. Since $\frac{R}{R-1}>0$ for $R>1$, and $\frac{R}{R-1}<0$ for $0<R<1$, we conclude that the function $\mu\mapsto D_R^A(\mu \parallel \upsilon)$ is convex on the family of all fuzzy P-measures on a given fuzzy measurable space $(X,M)$. Thus, for every real number $\lambda\in[0,1]$, it holds that:
$$D_R^A(\lambda\mu_1+(1-\lambda)\mu_2 \parallel \upsilon)\le\lambda\,D_R^A(\mu_1 \parallel \upsilon)+(1-\lambda)\,D_R^A(\mu_2 \parallel \upsilon). \;\square$$
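The convexity established in Theorem 14 can be sampled numerically. A sketch (Python; the value vectors for $\mu_1$, $\mu_2$, $\upsilon$ are hypothetical, chosen only to exercise the inequality):

```python
def d_r(mu, nu, R):
    """R-norm divergence of two value vectors, for R > 0, R != 1."""
    s = sum(m**R * n**(1 - R) for m, n in zip(mu, nu))
    return R / (R - 1) * (s**(1 / R) - 1)

# hypothetical value vectors mu_1(a_i), mu_2(a_i), upsilon(a_i)
mu1, mu2, nu = [0.5, 0.5], [0.2, 0.8], [1/3, 2/3]
for R in (0.5, 2, 5):
    for k in range(11):
        lam = k / 10
        mix = [lam * a + (1 - lam) * b for a, b in zip(mu1, mu2)]
        # convexity: divergence of the mixture never exceeds the mixture of divergences
        assert d_r(mix, nu, R) <= lam * d_r(mu1, nu, R) + (1 - lam) * d_r(mu2, nu, R) + 1e-12
print("convexity holds at all sampled points")
```

The small additive tolerance only absorbs floating-point rounding; the inequality itself is Theorem 14.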
Theorem 15.
Let $A=\{a_1,a_2,\dots,a_n\}$ be any fuzzy partition of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. Then:
(i) $0<R<1$ implies $D_R^A(\mu \parallel \upsilon)\le D^A(\mu \parallel \upsilon)$;
(ii) $R>1$ implies $D_R^A(\mu \parallel \upsilon)\ge D^A(\mu \parallel \upsilon)$,
where
$$D^A(\mu \parallel \upsilon)=\sum_{i=1}^{n}\mu(a_i)\log\frac{\mu(a_i)}{\upsilon(a_i)}.$$
Proof. 
In the proof, we use the Jensen inequality for the concave function $\varphi$ defined by $\varphi(x)=\log x$, for $x\in(0,\infty)$, with the weights $\alpha_i=\mu(a_i)$ and the points $x_i=\left(\frac{\upsilon(a_i)}{\mu(a_i)}\right)^{1-R}$, for $i=1,2,\dots,n$. Since the logarithm satisfies the condition $\log x\le x-1$, for all real numbers $x>0$, we get:
$$\log\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\le\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1.$$
Suppose that $0<R<1$. Then $\frac{R}{R-1}<0$, and using the inequality (11) and the Jensen inequality, we can write:
$$D_R^A(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1\right)\le\frac{R}{R-1}\log\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}=\frac{1}{R-1}\log\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}=\frac{1}{R-1}\log\sum_{i=1}^{n}\mu(a_i)\left(\frac{\mu(a_i)}{\upsilon(a_i)}\right)^{R-1}\le\frac{1}{R-1}\sum_{i=1}^{n}\mu(a_i)\log\left(\frac{\mu(a_i)}{\upsilon(a_i)}\right)^{R-1}=\sum_{i=1}^{n}\mu(a_i)\log\frac{\mu(a_i)}{\upsilon(a_i)}=D^A(\mu \parallel \upsilon).$$
The case of $R>1$ can be obtained in a similar way. □
Example 5.
Consider the fuzzy P-measures $\mu,\upsilon$ from Example 2 and the fuzzy partition $A=\{a,a'\}$ of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. Based on the results from Example 3, we have $D_2^A(\mu \parallel \upsilon)\doteq 0.1213$, $D_2^A(\upsilon \parallel \mu)\doteq 0.1082$, $D_{1/3}^A(\mu \parallel \upsilon)\doteq 0.0188$, $D_{1/3}^A(\upsilon \parallel \mu)\doteq 0.01908$. By a simple calculation, we get $D^A(\mu \parallel \upsilon)\doteq 0.084963$ and $D^A(\upsilon \parallel \mu)\doteq 0.081703$. Thus, for $R=\frac{1}{3}$ we have $D_R^A(\mu \parallel \upsilon)\le D^A(\mu \parallel \upsilon)$ and $D_R^A(\upsilon \parallel \mu)\le D^A(\upsilon \parallel \mu)$; and for $R=2$ we have $D_R^A(\mu \parallel \upsilon)\ge D^A(\mu \parallel \upsilon)$ and $D_R^A(\upsilon \parallel \mu)\ge D^A(\upsilon \parallel \mu)$, which is consistent with the statement of the previous theorem.
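The sandwich of Example 5 can be reproduced as follows (Python; `d_kl` uses the base-2 logarithm, which matches the value 0.084963 quoted in the example):

```python
import math

def d_r(mu, nu, R):
    """R-norm divergence of two value vectors, for R > 0, R != 1."""
    s = sum(m**R * n**(1 - R) for m, n in zip(mu, nu))
    return R / (R - 1) * (s**(1 / R) - 1)

def d_kl(mu, nu):
    """Kullback-Leibler divergence with the base-2 logarithm, as in Example 5."""
    return sum(m * math.log2(m / n) for m, n in zip(mu, nu))

mu, nu = [0.5, 0.5], [1/3, 2/3]
# Theorem 15 in both directions: D_{1/3} <= D <= D_2
assert d_r(mu, nu, 1/3) <= d_kl(mu, nu) <= d_r(mu, nu, 2)
assert d_r(nu, mu, 1/3) <= d_kl(nu, mu) <= d_r(nu, mu, 2)
print(round(d_kl(mu, nu), 6), round(d_kl(nu, mu), 6))
```

The printed Kullback–Leibler values agree with the figures quoted in Example 5 up to rounding in the last digit.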
We conclude our contribution with the formulation of a chain rule for the R-norm divergence in the fuzzy case. First, we define the conditional version of the R-norm divergence of fuzzy P-measures.
Definition 9.
Let $A=\{a_1,a_2,\dots,a_n\}$, $B=\{b_1,b_2,\dots,b_m\}$ be two fuzzy partitions of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. Then, for a positive real number $R$ not equal to 1, we define the conditional R-norm divergence of the fuzzy P-measures $\mu,\upsilon$ with respect to $B$, assuming a realization of $A$, as the number:
$$D_R^{B/A}(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\sum_{j=1}^{m}\mu(a_i b_j)^R\,\upsilon(a_i b_j)^{1-R}\right]^{\frac{1}{R}}-\left[\sum_{i=1}^{n}\mu(a_i)^R\,\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\right).$$
Theorem 16.
Let $A,B$ be two fuzzy partitions of fuzzy probability spaces $(X,M,\mu)$, $(X,M,\upsilon)$. Then:
$$D_R^{A\vee B}(\mu \parallel \upsilon)=D_R^A(\mu \parallel \upsilon)+D_R^{B/A}(\mu \parallel \upsilon).$$
Proof. 
Assume that $A=\{a_1,a_2,\dots,a_n\}$, $B=\{b_1,b_2,\dots,b_m\}$. Then we have:
$$D_R^A(\mu \parallel \upsilon)+D_R^{B/A}(\mu \parallel \upsilon)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}-1\right)+\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\sum_{j=1}^{m}\mu(a_i b_j)^R\upsilon(a_i b_j)^{1-R}\right]^{\frac{1}{R}}-\left[\sum_{i=1}^{n}\mu(a_i)^R\upsilon(a_i)^{1-R}\right]^{\frac{1}{R}}\right)=\frac{R}{R-1}\left(\left[\sum_{i=1}^{n}\sum_{j=1}^{m}\mu(a_i b_j)^R\upsilon(a_i b_j)^{1-R}\right]^{\frac{1}{R}}-1\right)=D_R^{A\vee B}(\mu \parallel \upsilon). \;\square$$
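Since the chain rule of Theorem 16 is a telescoping identity, it holds for any table of joint values consistent with the marginals. A sketch (Python; the joint values below are hypothetical, constructed only so that each row sums to the corresponding marginal):

```python
def d_r_flat(mu, nu, R):
    """R-norm divergence of two value vectors, for R > 0, R != 1."""
    s = sum(m**R * n**(1 - R) for m, n in zip(mu, nu))
    return R / (R - 1) * (s**(1 / R) - 1)

def d_r_cond(mu_joint, nu_joint, mu_a, nu_a, R):
    """Conditional R-norm divergence of Definition 9."""
    sj = sum(m**R * n**(1 - R) for m, n in zip(mu_joint, nu_joint))
    sa = sum(m**R * n**(1 - R) for m, n in zip(mu_a, nu_a))
    return R / (R - 1) * (sj**(1 / R) - sa**(1 / R))

# hypothetical joint values mu(a_i b_j), nu(a_i b_j); rows a_1 -> first two, a_2 -> last two
mu_joint = [0.3, 0.2, 0.1, 0.4]
nu_joint = [0.1, 0.3, 0.2, 0.4]
mu_a = [0.5, 0.5]   # row sums of mu_joint
nu_a = [0.4, 0.6]   # row sums of nu_joint
R = 2
lhs = d_r_flat(mu_joint, nu_joint, R)   # divergence over the joint partition
rhs = d_r_flat(mu_a, nu_a, R) + d_r_cond(mu_joint, nu_joint, mu_a, nu_a, R)
print(abs(lhs - rhs) < 1e-12)           # True: the chain rule telescopes exactly
```

The intermediate term over the partition $A$ cancels exactly, mirroring the one-line computation in the proof.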

5. Conclusions

In this article, we have extended the study of entropy measures and distance measures in the fuzzy case. Our goal was to introduce the concepts of R-norm entropy and R-norm divergence for the case of fuzzy probability spaces and to derive basic properties of these measures. Our results are presented in Section 3 and Section 4.
In Section 3, we have defined the R-norm entropy and conditional R-norm entropy of fuzzy partitions of a given fuzzy probability space and have examined the properties of the proposed entropy measures. In particular, it has been shown that the R-norm entropy of fuzzy partitions does not have the property of additivity, but it satisfies the property called pseudo-additivity, as stated in Theorem 10. In Theorem 6, chain rules for the R-norm entropy of fuzzy partitions are provided. Moreover, it was shown that the Shannon entropy and the conditional Shannon entropy of fuzzy partitions can be derived from the R-norm entropy and conditional R-norm entropy of fuzzy partitions, respectively, as the limiting cases for $R \to 1$.
In Section 4, the concept of R-norm divergence of fuzzy P-measures was introduced and the properties of this quantity were proven. Specifically, it was shown that the Kullback–Leibler divergence defined and studied in [32] can be derived from the R-norm divergence of fuzzy P-measures as the limiting case for $R \to 1$. The result of Theorem 12 allows us to interpret the R-norm divergence as a distance measure between two fuzzy P-measures. Theorem 13 provides a relationship between the R-norm divergence and the R-norm entropy of fuzzy partitions; Theorem 15 provides a relationship between the R-norm divergence and the Kullback–Leibler divergence of fuzzy P-measures. In addition, the concavity of the R-norm entropy (Theorem 7) and the convexity of the R-norm divergence (Theorem 14) have been demonstrated. Finally, using the suggested concept of conditional R-norm divergence of fuzzy P-measures, the chain rule for the R-norm divergence of fuzzy P-measures was established.
In the proofs, the Jensen inequality, L’Hôpital’s rule, and the Minkowski inequality were used. The results presented in Section 3 and Section 4 are illustrated with numerical examples.

Acknowledgments

The authors are grateful to the editor and the anonymous referees for their valuable comments and suggestions. The authors acknowledge Constantine the Philosopher University in Nitra for covering the costs of publishing in open access.

Author Contributions

All authors contributed significantly to the theoretical work as well as to the creation of illustrative examples. Dagmar Markechová wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
  2. Gray, R.M. Entropy and Information Theory; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  3. Rényi, A. On measures of information and entropy. In Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1960; University of California Press: Berkeley, CA, USA, 1961; pp. 547–561. [Google Scholar]
  4. Ellerman, D. An Introduction to Logical Entropy and Its Relation to Shannon Entropy. Int. J. Semant. Comput. 2013, 7, 121–145. [Google Scholar] [CrossRef]
  5. Ellerman, D. Logical Information Theory: New Foundations for Information Theory. Log. J. IGPL 2017, 25, 806–835. [Google Scholar] [CrossRef]
  6. Arimoto, S. Information theoretical considerations on estimation problems. Inf. Control 1971, 19, 181–194. [Google Scholar] [CrossRef]
  7. Boekee, D.E.; Van der Lubbe, J.C.A. The R-Norm Information Measure. Inf. Control 1980, 45, 136–155. [Google Scholar] [CrossRef]
  8. Hooda, D.S.; Ram, A. Characterization of a Generalized Measure of R-norm Entropy. Caribb. J. Math. Comput. Sci. 2002, 8, 18–31. [Google Scholar]
  9. Hooda, D.S.; Sharma, D.K. Generalized R-norm Information Measures. J. Appl. Math. Stat. Inform. 2008, 4, 153–168. [Google Scholar]
  10. Hooda, D.S.; Bajaj, R.K. On Generalized R-norm Information Measures of Fuzzy Information. J. Appl. Math. Stat. Inform. 2008, 4, 199–212. [Google Scholar]
  11. Kumar, S.; Choudhary, A. Generalized Parametric R-norm Information Measure. Trends Appl. Sci. Res. 2012, 7, 350–369. [Google Scholar]
  12. Kumar, S.; Choudhary, A.; Kumar, R. Some More Results on a Generalized Parametric R-Norm Information Measure of Type α. J. Appl. Sci. Eng. 2014, 17, 447–453. [Google Scholar] [CrossRef]
  13. Zadeh, L.A. Fuzzy Sets. Inf. Control 1965, 8, 338–358. [Google Scholar] [CrossRef]
  14. Piasecki, K. Fuzzy partitions of sets. BUSEFAL 1986, 25, 52–60. [Google Scholar]
  15. De Baets, B.; Mesiar, R. T-Partitions. Fuzzy Sets Syst. 1998, 97, 211–223. [Google Scholar] [CrossRef]
  16. Mesiar, R.; Reusch, B.; Thiele, H. Fuzzy equivalence relations and fuzzy partition. J. Mult. Valued Log. Soft Comput. 2006, 12, 167–181. [Google Scholar]
  17. Jayaram, B.; Mesiar, R. I-fuzzy equivalence relations and I-fuzzy partitions. Inf. Sci. 2009, 179, 1278–1297. [Google Scholar] [CrossRef]
  18. Montes, S.; Couso, I.; Gil, P. Fuzzy delta-epsilon partitions. Inf. Sci. 2003, 152, 267–285. [Google Scholar] [CrossRef]
  19. Montes, S.; Couso, I.; Gil, P. One-to-one correspondence between ε-partitions, (1 − ε)-equivalences and ε-pseudometrics. Fuzzy Sets Syst. 2001, 124, 87–95. [Google Scholar] [CrossRef]
  20. Dumitrescu, D. Fuzzy partitions with the connectives T, S. Fuzzy Sets Syst. 1992, 47, 193–195. [Google Scholar] [CrossRef]
  21. Dumitrescu, D. Fuzzy measures and entropy of fuzzy partitions. J. Math. Anal. Appl. 1993, 176, 359–373. [Google Scholar] [CrossRef]
  22. Riečan, B. An entropy construction inspired by fuzzy sets. Soft Comput. 2003, 7, 486–488. [Google Scholar]
  23. Markechová, D. The entropy of fuzzy dynamical systems and generators. Fuzzy Sets Syst. 1992, 48, 351–363. [Google Scholar] [CrossRef]
  24. Markechová, D. Entropy of complete fuzzy partitions. Math. Slovaca 1993, 43, 1–10. [Google Scholar]
  25. Markechová, D. Entropy and mutual information of experiments in the fuzzy case. Neural Netw. World 2013, 23, 339–349. [Google Scholar] [CrossRef]
  26. Mesiar, R. The Bayes principle and the entropy on fuzzy probability spaces. Int. J. Gen. Syst. 1991, 20, 67–72. [Google Scholar] [CrossRef]
  27. Mesiar, R.; Rybárik, J. Entropy of Fuzzy Partitions—A General Model. Fuzzy Sets Syst. 1998, 99, 73–79. [Google Scholar] [CrossRef]
  28. Rahimi, M.; Riazi, A. On local entropy of fuzzy partitions. Fuzzy Sets Syst. 2014, 234, 97–108. [Google Scholar] [CrossRef]
  29. Srivastava, P.; Khare, M.; Srivastava, Y.K. m-Equivalence, entropy and F-dynamical systems. Fuzzy Sets Syst. 2001, 121, 275–283. [Google Scholar] [CrossRef]
  30. Khare, M. Fuzzy σ-algebras and conditional entropy. Fuzzy Sets Syst. 1999, 102, 287–292. [Google Scholar] [CrossRef]
  31. Markechová, D.; Riečan, B. Entropy of Fuzzy Partitions and Entropy of Fuzzy Dynamical Systems. Entropy 2016, 18, 19. [Google Scholar] [CrossRef]
  32. Markechová, D. Kullback–Leibler Divergence and Mutual Information of Experiments in the Fuzzy Case. Axioms 2017, 6, 5. [Google Scholar] [CrossRef]
  33. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  34. Kôpka, F.; Chovanec, F. D-posets. Math. Slovaca 1994, 44, 21–34. [Google Scholar]
  35. Kôpka, F. Quasiproduct on Boolean D-posets. Int. J. Theor. Phys. 2008, 47, 26–35. [Google Scholar] [CrossRef]
  36. Frič, R. On D-posets of fuzzy sets. Math. Slovaca 2014, 64, 545–554. [Google Scholar] [CrossRef]
  37. Riečan, B.; Mundici, D. Probability on MV-algebras. In Handbook of Measure Theory; Pap, E., Ed.; Elsevier: Amsterdam, The Netherlands, 2002; pp. 869–910. [Google Scholar]
  38. Mundici, D. MV Algebras: A Short Tutorial. 2007. Available online: http://www.matematica.uns.edu.ar/IX CongresoMonteiro/Comunicaciones/Mundici_tutorial.pdf (accessed on 26 May 2007).
  39. Jakubík, J. On product MV algebras. Czechoslov. Math. J. 2002, 52, 797–810. [Google Scholar] [CrossRef]
  40. Di Nola, A.; Dvurečenskij, A. Product MV-algebras. Mult. Valued Log. 2001, 6, 193–215. [Google Scholar]
  41. Dvurečenskij, A.; Pulmannová, S. New Trends in Quantum Structures; Springer: Dordrecht, The Netherlands, 2000. [Google Scholar]
  42. Foulis, D.J.; Bennet, M.K. Effect algebras and unsharp quantum logics. Found. Phys. 1994, 24, 1331–1352. [Google Scholar] [CrossRef]
  43. Markechová, D.; Riečan, B. Logical Entropy of Fuzzy Dynamical Systems. Entropy 2016, 18, 157. [Google Scholar] [CrossRef]
  44. Di Nola, A.; Dvurečenskij, A.; Hyčko, M.; Manara, C. Entropy on Effect Algebras with the Riesz Decomposition Property II: MV-Algebras. Kybernetika 2005, 41, 161–176. [Google Scholar]
  45. Riečan, B. Kolmogorov–Sinaj entropy on MV-algebras. Int. J. Theor. Phys. 2005, 44, 1041–1052. [Google Scholar] [CrossRef]
  46. Di Nola, A.; Dvurečenskij, A.; Hyčko, M.; Manara, C. Entropy on Effect Algebras with the Riesz Decomposition Property I: Basic Properties. Kybernetika 2005, 41, 143–160. [Google Scholar]
  47. Eslami Giski, Z.; Ebrahimi, M. Entropy of Countable Partitions on effect Algebras with the Riesz Decomposition Property and Weak Sequential Effect Algebras. Cankaya Univ. J. Sci. Eng. 2015, 12, 20–39. [Google Scholar]
  48. Ebrahimi, M.; Mosapour, B. The Concept of Entropy on D-posets. Cankaya Univ. J. Sci. Eng. 2013, 10, 137–151. [Google Scholar]
  49. Petrovičová, J. On the entropy of partitions in product MV-algebras. Soft Comput. 2000, 4, 41–44. [Google Scholar] [CrossRef]
  50. Petrovičová, J. On the entropy of dynamical systems in product MV-algebras. Fuzzy Sets Syst. 2001, 121, 347–351. [Google Scholar] [CrossRef]
  51. Markechová, D.; Riečan, B. Kullback–Leibler Divergence and Mutual Information of Partitions in Product MV Algebras. Entropy 2017, 19, 267. [Google Scholar] [CrossRef]
  52. Markechová, D.; Mosapour, B.; Ebrahimzadeh, A. Logical Divergence, Logical Entropy, and Logical Mutual Information in Product MV-Algebras. Entropy 2018, 20, 129. [Google Scholar] [CrossRef]
  53. Ebrahimzadeh, A.; Eslami Giski, Z.; Markechová, D. Logical Entropy of Dynamical Systems—A General Model. Mathematics 2017, 5, 4. [Google Scholar] [CrossRef]
  54. Ebrahimzadeh, A. Logical entropy of quantum dynamical systems. Open Phys. 2016, 14, 1–5. [Google Scholar] [CrossRef]
  55. Ebrahimzadeh, A. Quantum conditional logical entropy of dynamical systems. Ital. J. Pure Appl. Math. 2016, 36, 879–886. [Google Scholar]
  56. Ebrahimzadeh, A.; Jamalzadeh, J. Conditional logical entropy of fuzzy σ-algebras. J. Intell. Fuzzy Syst. 2017, 33, 1019–1026. [Google Scholar] [CrossRef]
  57. Eslami Giski, Z.; Ebrahimzadeh, A. An introduction of logical entropy on sequential effect algebra. Indag. Math. 2017, 28, 928–937. [Google Scholar] [CrossRef]
  58. Mohammadi, U. The Concept of Logical Entropy on D-posets. J. Algebra Struct. Appl. 2016, 1, 53–61. [Google Scholar]
  59. Piasecki, K. Probability of fuzzy events defined as denumerable additive measure. Fuzzy Sets Syst. 1985, 17, 271–284. [Google Scholar] [CrossRef]
