## 1. Introduction

Let X be a discrete random variable with support $\{{x}_{1},\dots ,{x}_{n}\}$ and with probability mass function vector $\underline{p}=({p}_{1},\dots ,{p}_{n})$. The Shannon entropy of X is defined as

$$H(\underline{p})=-\sum_{i=1}^{n}{p}_{i}\log {p}_{i},$$

where log is the natural logarithm; see [1]. It is a measure of information and discrimination about the uncertainty related to the random variable X. Lad et al. [2] proposed the extropy as the dual of entropy. This is a measure of the uncertainty related to the complementary outcomes and is defined as

$$J(\underline{p})=-\sum_{i=1}^{n}(1-{p}_{i})\log (1-{p}_{i}).$$

Lad et al. proved the following property relating the sum of the entropy and the extropy:

$$H(\underline{p})+J(\underline{p})=\sum_{i=1}^{n}H({p}_{i},1-{p}_{i})=\sum_{i=1}^{n}J({p}_{i},1-{p}_{i}),$$

where $H({p}_{i},1-{p}_{i})=J({p}_{i},1-{p}_{i})=-{p}_{i}\log {p}_{i}-(1-{p}_{i})\log (1-{p}_{i})$ are the entropy and the extropy of a discrete random variable whose support has cardinality two and whose probability mass function vector is $({p}_{i},1-{p}_{i})$.
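As a quick numerical check of the identity above, the following Python sketch computes the entropy and extropy of a small probability vector and verifies that their sum equals the sum of the binary entropies $H({p}_{i},1-{p}_{i})$:

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i log p_i (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def extropy(p):
    """Extropy of Lad et al.: J(p) = -sum (1 - p_i) log(1 - p_i)."""
    return -sum((1 - pi) * math.log(1 - pi) for pi in p if pi < 1)

p = [0.2, 0.3, 0.5]
lhs = entropy(p) + extropy(p)
rhs = sum(entropy([pi, 1 - pi]) for pi in p)  # sum of binary entropies
assert abs(lhs - rhs) < 1e-12
```

The identity holds for any probability vector, which is what makes the extropy a natural dual of the entropy.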

Dempster [3] and Shafer [4] introduced a method to study uncertainty. Their theory of evidence is a generalization of the classical probability theory. In D-S theory, an uncertain event with a finite number of alternatives is considered, and a mass function over the power set of the alternatives (i.e., a degree of confidence assigned to each of its subsets) is defined. If positive mass is given only to singletons, a probability mass function is obtained. D-S theory thus allows us to describe more general situations in which the available information is less specific.

Here we describe an example studied in [5] to explain how D-S theory extends the classical probability theory. Consider two boxes, A and B, such that A contains only red balls, B contains only green balls, and the number of balls in each box is unknown. A ball is picked randomly from one of these two boxes. Box A is selected with probability ${p}_{A}=0.6$ and box B with probability ${p}_{B}=0.4$. Hence, the probability of picking a red ball is 0.6, $\mathbb{P}\left(R\right)=0.6$, whereas the probability of picking a green ball is 0.4, $\mathbb{P}\left(G\right)=0.4$. Suppose now that box B contains both green and red balls in unknown proportions. Box A is still selected with probability ${p}_{A}=0.6$ and box B with probability ${p}_{B}=0.4$. In this case, we cannot obtain the probability of picking a red ball. To analyze this problem, we can use D-S theory to express the uncertainty. In particular, we choose a mass function m such that $m\left(R\right)=0.6$ and $m(R,G)=0.4$.

Dempster–Shafer theory of evidence has several applications due to its advantages in dealing with uncertainty; for example, it is used in decision making [6,7], in risk evaluation [8,9], in reliability analysis [10,11], and so on. In the following, we recall the basic notions of this theory.

Let X be a frame of discernment, i.e., a set of mutually exclusive and collectively exhaustive events, indicated by $X=\{{\theta}_{1},{\theta}_{2},\dots ,{\theta}_{\left|X\right|}\}$. The power set of X is indicated by ${2}^{X}$ and it has cardinality ${2}^{\left|X\right|}$. A function $m:{2}^{X}\to [0,1]$ is called a mass function or a basic probability assignment (BPA) if

$$m\left(\varnothing \right)=0 \quad \text{and} \quad \sum_{A\subseteq X}m\left(A\right)=1.$$

If $m\left(A\right)\ne 0$ implies $\left|A\right|=1$, then m is also a probability mass function, i.e., BPAs generalize discrete random variables. Moreover, the sets A such that $m\left(A\right)>0$ are called focal elements. Given a BPA, we can evaluate the pignistic probability transformation (PPT) of each element of the frame. Let us recall that the pignistic probability is the probability that a thinking being would assign to that event. It represents a point estimate of belief and can be determined as [12]

$$PPT\left(\theta \right)=\sum_{A\subseteq X:\,\theta \in A}\frac{m\left(A\right)}{\left|A\right|}, \qquad \theta \in X.$$
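The BPA axioms and the pignistic transformation can be sketched in a few lines of Python. The dictionary-of-frozensets representation is an illustrative choice of ours, not notation from the paper; the example reuses the two-box mass function from the introduction:

```python
def is_bpa(m):
    """Check the two BPA axioms: m(emptyset) = 0 and the masses sum to 1."""
    return m.get(frozenset(), 0.0) == 0.0 and abs(sum(m.values()) - 1.0) < 1e-12

def pignistic(m):
    """Pignistic probability of each singleton: sum of m(A)/|A| over A containing it."""
    out = {}
    for A, mass in m.items():
        for x in A:
            out[x] = out.get(x, 0.0) + mass / len(A)
    return out

# Box example from the text: m(R) = 0.6, m({R, G}) = 0.4
m = {frozenset({"R"}): 0.6, frozenset({"R", "G"}): 0.4}
assert is_bpa(m)
print(pignistic(m))  # R gets 0.6 + 0.4/2 = 0.8, G gets 0.2
```

Note how the mass on the non-singleton set $\{R,G\}$ is split equally among its elements, turning a BPA into a point-estimate probability distribution.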

If we have a weight or a reliability of evidence, represented by a coefficient $\alpha \in [0,1]$, we can use it to generate another BPA ${m}^{\alpha}$ in the following way (see [4]):

$$m^{\alpha}\left(A\right)=\alpha \, m\left(A\right), \quad A\subset X, \qquad m^{\alpha}\left(X\right)=\alpha \, m\left(X\right)+1-\alpha .$$
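The discounting operation above (Shafer's classical discounting: scale every mass by $\alpha$ and move the leftover $1-\alpha$ onto the whole frame) can be sketched as:

```python
def discount(m, alpha, frame):
    """Shafer discounting: scale each mass by alpha and assign the
    remaining 1 - alpha to the whole frame of discernment."""
    X = frozenset(frame)
    out = {A: alpha * v for A, v in m.items() if v > 0}
    out[X] = out.get(X, 0.0) + (1 - alpha)
    return out

m = {frozenset({"R"}): 0.6, frozenset({"R", "G"}): 0.4}
d = discount(m, 0.9, {"R", "G"})
print(d)  # mass on {R} becomes 0.54; mass on {R, G} becomes 0.36 + 0.1 = 0.46
```

The result is again a valid BPA: the masses still sum to one, and lower reliability $\alpha$ pushes more mass toward total ignorance (the full frame X).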

If we have two BPAs ${m}_{1},\phantom{\rule{4pt}{0ex}}{m}_{2}$ for a frame of discernment X, we can introduce another BPA ${m}^{\ast}$ for X using the Dempster rule of combination; see [3]. We define ${m}^{\ast}\left(A\right)$, $A\subseteq X$, in the following way:

$$m^{\ast}\left(A\right)=\begin{cases}\dfrac{1}{1-K}\displaystyle\sum_{B,C\subseteq X:\,B\cap C=A}{m}_{1}\left(B\right){m}_{2}\left(C\right), & A\ne \varnothing ,\\ 0, & A=\varnothing ,\end{cases}$$

where $K={\sum}_{B,C\subseteq X:B\cap C=\varnothing}{m}_{1}\left(B\right){m}_{2}\left(C\right)$ measures the conflict between the two BPAs. We remark that, if $K=1$, we cannot apply the Dempster rule of combination.
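A minimal implementation of Dempster's rule, again using the frozenset-keyed dictionary representation (an illustrative choice, not notation from the paper):

```python
def combine(m1, m2):
    """Dempster's rule: conjunctive combination normalized by 1 - K,
    where K is the total mass of conflicting (empty-intersection) pairs."""
    raw, K = {}, 0.0
    for B, v1 in m1.items():
        for C, v2 in m2.items():
            A = B & C
            if A:
                raw[A] = raw.get(A, 0.0) + v1 * v2
            else:
                K += v1 * v2  # conflicting mass
    if abs(K - 1.0) < 1e-12:
        raise ValueError("total conflict (K = 1): rule not applicable")
    return {A: v / (1 - K) for A, v in raw.items()}

m1 = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
m2 = {frozenset({"b"}): 0.5, frozenset({"a", "b"}): 0.5}
print(combine(m1, m2))  # here K = 0.6 * 0.5 = 0.3
```

Dividing by $1-K$ redistributes the conflicting mass proportionally over the non-empty intersections, so the result is again a valid BPA.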

Recently, several measures of discrimination and uncertainty have been proposed in the literature (see, for instance, [13,14,15,16,17,18,19]). In particular, in the context of the Dempster–Shafer evidence theory, there are interesting measures of discrimination, such as the Deng entropy; it is this latter concept that suggested to us the introduction of a dual definition.

The Deng entropy was introduced in [5] for a BPA m as

$$E_{d}\left(m\right)=-\sum_{A\subseteq X:\,m\left(A\right)>0}m\left(A\right)\log \frac{m\left(A\right)}{{2}^{\left|A\right|}-1}.$$

This entropy is similar to the Shannon entropy, and the two coincide if the BPA is also a probability mass function. The term ${2}^{\left|A\right|}-1$ represents the potential number of states in A. For a fixed value of $m\left(A\right)$, as the cardinality of A increases, ${2}^{\left|A\right|}-1$ increases and so does the Deng entropy.
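The following sketch computes the Deng entropy and checks the two facts just stated: it reduces to the Shannon entropy when all focal elements are singletons, and it grows with the cardinality of the focal elements:

```python
import math

def deng_entropy(m):
    """Deng entropy: -sum over focal elements of m(A) log( m(A) / (2^|A| - 1) )."""
    return -sum(v * math.log(v / (2 ** len(A) - 1))
                for A, v in m.items() if v > 0)

# Singleton focal elements: 2^1 - 1 = 1, so this is just Shannon entropy
m_prob = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}
assert abs(deng_entropy(m_prob) - math.log(2)) < 1e-12

# All mass on one set: entropy is log(2^|A| - 1), increasing in |A|
m_pair = {frozenset({"a", "b"}): 1.0}
m_triple = {frozenset({"a", "b", "c"}): 1.0}
assert deng_entropy(m_triple) > deng_entropy(m_pair)  # log 7 > log 3
```
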

In the literature, several properties of Deng entropy have been studied (see, for instance, [20]), and other measures of uncertainty based on Deng entropy have been introduced (see [21,22]). Other relevant measures of uncertainty and information known in the Dempster–Shafer theory of evidence are, for example, Höhle's confusion measure [23], Yager's dissonance measure [24] and Klir and Ramer's discord measure [25].

The aim of this paper is to dualize the Deng entropy by defining a corresponding extropy. We present some examples comparing the Deng entropy with our new extropy and study their monotonicity. Then we investigate the relations between these measures and the behaviour of the Deng extropy under changes of focal elements. Moreover, a characterization result is given for the maximum Deng extropy. Finally, an application to pattern recognition is given, in which the Deng extropy makes the right recognition. In the conclusions, the results contained in the paper are summarized.

## 4. Application to Pattern Recognition

In this section, we investigate an application of the Deng extropy in pattern recognition by using the Iris dataset given in [27]. This example was already studied in [28] to analyze the applications of another belief entropy defined in the Dempster–Shafer evidence theory. We compare a method proposed by Kang [29] with a method based on the use of the Deng extropy. The Iris dataset is useful to introduce the application of the generation of BPAs based on the Deng extropy to the classification of the kind of flowers. The dataset is composed of 150 samples. For each one, we have the sepal length in cm (SL), the sepal width in cm (SW), the petal length in cm (PL), the petal width in cm (PW) and the class, which is exactly one of Iris Setosa (Se), Iris Versicolour (Ve) and Iris Virginica (Vi). The samples are equally distributed among the classes. We select 40 samples of each kind of Iris and then use the max–min values of the samples to generate a model of interval numbers, as shown in Table 2. Each element of the dataset can be regarded as an unknown test sample. Suppose the selected sample data is [6.1, 3.0, 4.9, 1.8, Iris Virginica].

Four BPAs are generated with a method proposed by Kang et al. based on the similarity of interval numbers [29]. Given two intervals $A=[{a}_{1},{a}_{2}]$ and $B=[{b}_{1},{b}_{2}]$, their similarity $S(A,B)$ is defined as

$$S(A,B)=\frac{1}{1+\alpha D(A,B)},$$

where $\alpha >0$ is the coefficient of support (we choose $\alpha =5$) and $D(A,B)$ is the distance between the intervals A and B defined in [30] as

$$D(A,B)=\sqrt{{\left(\frac{{a}_{1}+{a}_{2}}{2}-\frac{{b}_{1}+{b}_{2}}{2}\right)}^{2}+\frac{1}{3}\left[{\left(\frac{{a}_{2}-{a}_{1}}{2}\right)}^{2}+{\left(\frac{{b}_{2}-{b}_{1}}{2}\right)}^{2}\right]}.$$

In order to generate the BPAs, the intervals given in Table 2 are used as interval A, and as interval B we use the singleton intervals given by the selected sample. For each of the four properties, we get seven values of similarity and then obtain a BPA by normalizing them (see Table 3). Hence, we evaluate the Deng extropy of these BPAs, as shown in the bottom row of Table 3. We obtain a combined BPA by using the Dempster rule of combination (6). The type of the unknown sample is determined by the combined BPA. From Equation (4), we get the maximum value of the PPT. Hence, Kang's method assigns to the sample the type Iris Versicolour, and so it does not make the right decision.

Next, we use the Deng extropies given in Table 3 to generate other BPAs. We refer to these extropies as $E{X}_{d}\left(SL\right),\phantom{\rule{4pt}{0ex}}E{X}_{d}\left(SW\right),\phantom{\rule{4pt}{0ex}}E{X}_{d}\left(PL\right),\phantom{\rule{4pt}{0ex}}E{X}_{d}\left(PW\right)$. We use these values as the significance of each sample property to evaluate the weight of each property. For the sample property sepal length we have

$$w_{SL}=\frac{E{X}_{d}\left(SL\right)}{E{X}_{d}\left(SL\right)+E{X}_{d}\left(SW\right)+E{X}_{d}\left(PL\right)+E{X}_{d}\left(PW\right)},$$

and similarly for the other properties. We divide each weight by the maximum of the weights and use these values as discounting coefficients to generate new BPAs, as shown in Equation (5); see Table 4. Again, a combined BPA is obtained by using the Dempster rule of combination. The type of the unknown sample is determined by the combined BPA. Hence, the method based on the Deng extropy makes the right recognition.
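The weighting step described above (normalize the per-attribute extropies into weights, then divide by the largest weight to get discounting coefficients in $(0,1]$) can be sketched as follows; the extropy values used here are hypothetical placeholders, not the values of Table 3:

```python
def discount_coefficients(extropies):
    """Turn per-attribute Deng extropies into discounting coefficients:
    normalize to weights summing to 1, then divide by the maximum weight
    so the most significant attribute gets coefficient 1."""
    total = sum(extropies.values())
    weights = {k: v / total for k, v in extropies.items()}
    wmax = max(weights.values())
    return {k: w / wmax for k, w in weights.items()}

# Hypothetical extropy values for the four Iris attributes
ex = {"SL": 1.2, "SW": 1.0, "PL": 0.8, "PW": 0.9}
print(discount_coefficients(ex))  # "SL" maps to 1.0; the others are scaled relative to it
```

Each coefficient would then be fed to the discounting operation of Equation (5) before the four attribute BPAs are fused with the Dempster rule.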

We tested all 150 samples, and we find that the global recognition rate of Kang's method is 93.33%, whereas that of the method based on the Deng extropy is 94%. The results are shown in Table 5.