Fractional Deng Entropy and Extropy and Some Applications

Deng entropy and extropy are two measures useful in the Dempster–Shafer evidence theory (DST) to study uncertainty, following the idea that extropy is the dual concept of entropy. In this paper, we present their fractional versions named fractional Deng entropy and extropy and compare them to other measures in the framework of DST. Here, we study the maximum for both of them and give several examples. Finally, we analyze a problem of classification in pattern recognition in order to highlight the importance of these new measures.


Introduction
The concept of entropy as a measure of uncertainty was first introduced by Shannon [1], and since then it has been used in information theory, image and signal processing, and economics. Let X be a discrete random variable with probability mass function vector $p = (p_1, \dots, p_n)$. The Shannon entropy of X is defined as follows:
$$H(X) = -\sum_{i=1}^{n} p_i \log p_i,$$
where log(·) stands for the natural logarithm with the convention 0 log 0 = 0. Recently, the dual measure of entropy has become widespread. It is known as extropy and was defined for a discrete random variable X by Lad et al. [2] as
$$J(X) = -\sum_{i=1}^{n} (1 - p_i) \log(1 - p_i),$$
and since then, like the Shannon entropy, it has been studied in several contexts and in its differential version [3][4][5][6].
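As a small numerical illustration of the two measures just recalled, the following sketch evaluates the Shannon entropy and the extropy of a probability vector with the natural logarithm. The function names are illustrative and are not taken from the cited works.

```python
import math

def shannon_entropy(p):
    """H(X) = -sum p_i log p_i (natural log, with 0 log 0 = 0)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def extropy(p):
    """J(X) = -sum (1 - p_i) log(1 - p_i); terms with p_i = 1 contribute 0."""
    return -sum((1 - pi) * math.log(1 - pi) for pi in p if pi < 1)

p = [0.5, 0.25, 0.25]
print("entropy:", shannon_entropy(p))
print("extropy:", extropy(p))
```

For n = 2 the two measures coincide, since the vector (1 − p_1, 1 − p_2) is a permutation of (p_1, p_2).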
The generalization of Shannon entropy to various fields is always of great interest. Ubriaco [7] defined a new entropy based on fractional calculus as follows:
$$S_q(X) = \sum_{i=1}^{n} p_i \left(-\log p_i\right)^q, \qquad 0 < q \le 1.$$
The fractional entropy is concave, positive and non-additive. Moreover, for q = 1, the fractional entropy reduces to the Shannon entropy. From a physical point of view, it also satisfies Lesche stability and thermodynamic stability.
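The fractional entropy can be checked numerically with a few lines of code. The sketch below assumes the form $\sum_i p_i (-\log p_i)^q$ with $0 < q \le 1$ and the natural logarithm; the function name is illustrative.

```python
import math

def fractional_entropy(p, q):
    """Fractional entropy sum p_i * (-log p_i)**q, assumed for 0 < q <= 1."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    # log(1 / p_i) equals -log(p_i) and is exactly 0.0 when p_i == 1
    return sum(pi * math.log(1 / pi) ** q for pi in p if pi > 0)

p = [0.5, 0.25, 0.25]
for q in (0.25, 0.5, 1.0):
    print(q, fractional_entropy(p, q))
```

With q = 1 the value agrees with the Shannon entropy of p, and for a uniform distribution on n points it equals $(\log n)^q$.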
The purpose of this paper is to extend Deng entropy and extropy to the fractional case. Deng entropy and extropy [8,9] are two measures of uncertainty known in the context of the Dempster-Shafer theory (DST) of evidence. The DST of evidence [10,11] is a generalization of the classical probability theory. In DST, an uncertain event with a finite number of alternatives is considered, and a mass function over the power set of the alternatives, interpreted as a degree of confidence, is defined. DST allows us to describe more general situations in which there is less specific information than the classical probability theory requires. DST has several applications due to its advantages in dealing with uncertainty; for example, it is used in reliability analysis [12,13], in decision making [14,15], and so on [16,17].
Now, we describe an example given in [8] to explain how DST extends the classical probability theory. Consider two boxes, A and B, such that in A there are only red balls, whereas in B there are only green balls, and the number of balls in each box is unknown. A ball is picked randomly from one of the boxes. Box A is chosen with probability $p_A = 0.6$ and box B is selected with probability $p_B = 0.4$. Thus, the probability of picking a red ball is 0.6, P(R) = 0.6, and the probability of picking a green ball is 0.4, P(G) = 0.4. Now, suppose that box B contains green and red balls in unknown proportions, while $p_A$ and $p_B$ are unchanged. In this case, we cannot obtain the probability of picking a red ball. To overcome this problem, we can use DST to express the uncertainty. In particular, we choose a mass function m such that m(R) = 0.6 and m(R, G) = 0.4.
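The box example can be written down directly as a mass function on subsets. The sketch below stores focal elements as `frozenset` keys and uses the standard belief and plausibility sums of DST to bracket the unknown probability of a red ball; the helper names are illustrative.

```python
def belief(m, a):
    """Total mass committed to subsets of a (a lower bound for P(a))."""
    return sum(v for b, v in m.items() if b <= a)

def plausibility(m, a):
    """Total mass of focal elements that intersect a (an upper bound for P(a))."""
    return sum(v for b, v in m.items() if b & a)

R, G = frozenset({"R"}), frozenset({"G"})
m = {R: 0.6, R | G: 0.4}  # m(R) = 0.6, m(R, G) = 0.4

print("red:  ", belief(m, R), plausibility(m, R))
print("green:", belief(m, G), plausibility(m, G))
```

The mass 0.4 on {R, G} is not split between the two colors, so the probability of red is only known to lie between 0.6 and 1.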
The rest of the paper is organized as follows. In Section 2, we recall the basic notions of the Dempster-Shafer theory of evidence and some of the most important measures of uncertainty in this context. In Section 3, we define and study the fractional Deng entropy. In Section 4, we introduce the fractional Deng extropy, and several examples are given. In Section 5, we apply fractional Deng entropy and fractional Deng extropy to a problem of classification. Finally, in Section 6, we give conclusions and summarize the results obtained in the paper.

Preliminaries
In this section, we review some basic definitions in the Dempster-Shafer evidence theory (DST) [10,11] and Deng entropy [8].
Definition 1. Let $X = \{\theta_1, \theta_2, \dots, \theta_i, \dots, \theta_{|X|}\}$ be a finite set of mutually exclusive and collectively exhaustive events. X is called the frame of discernment (FOD). The power set of X consists of $2^{|X|}$ elements, denoted as follows:
$$2^X = \{\emptyset, \{\theta_1\}, \dots, \{\theta_{|X|}\}, \{\theta_1, \theta_2\}, \dots, \{\theta_1, \theta_2, \dots, \theta_i\}, \dots, X\}.$$
Definition 2. (Mass function) Given a FOD $X = \{\theta_1, \theta_2, \dots, \theta_i, \dots, \theta_{|X|}\}$, a mapping m from $2^X$ to [0, 1] is called a mass function, or basic probability assignment (BPA), formally defined by:
$$m(\emptyset) = 0, \qquad \sum_{A \in 2^X} m(A) = 1.$$
In DST, m(A) represents how strongly the evidence supports A, that is, m(A) measures the belief exactly assigned to A. If m(A) > 0, then A is called a focal element.
Recently, some operations on BPAs have been presented, such as negation [18] and correlation [19]. In several applications, we need to generate a new BPA starting from independent BPAs or from a weight of evidence represented by a coefficient α ∈ (0, 1]. In DST, there are different indices to evaluate the degree of belief in a subset of the FOD. Among them, here we recall the definitions of belief function, plausibility function and pignistic probability transformation (PPT).
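To make the pignistic transformation concrete, the following sketch assumes its standard form, $PPT(A) = \sum_{B} m(B)\,|A \cap B| / |B|$, which spreads the mass of each focal element evenly over its elements. The validity check and the function names are illustrative.

```python
def is_bpa(m, tol=1e-9):
    """Check m(empty set) = 0, masses in [0, 1], and total mass equal to 1."""
    return (all(len(a) > 0 and 0 <= v <= 1 for a, v in m.items())
            and abs(sum(m.values()) - 1) <= tol)

def pignistic(m, a):
    """PPT(a): each focal element b gives a the share |a & b| / |b| of m(b)."""
    return sum(v * len(a & b) / len(b) for b, v in m.items())

R, G = frozenset({"R"}), frozenset({"G"})
m = {R: 0.6, R | G: 0.4}  # the two-box example from the Introduction

print(is_bpa(m), pignistic(m, R), pignistic(m, G))
```

In the two-box example the mass on {R, G} is shared equally, which gives 0.8 for red and 0.2 for green.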

Some Uncertainty Measures for the Dempster-Shafer Framework
In the context of the DST, there are interesting measures of discrimination, such as Deng entropy, which in some cases has several advantages over other uncertainty measures in the DST framework. It is this latter concept that suggested to us the introduction of a new extension. In Table 1, we present the definitions of some of the most important measures of uncertainty in DST.
Deng entropy degenerates to the Shannon entropy if, and only if, a positive mass function value is assigned only to singleton elements, that is, $E_d(m) = -\sum_{\theta \in X} m(\{\theta\}) \log m(\{\theta\})$. Deng entropy has attracted the interest of researchers, and several of its generalizations have been studied. In Table 2, we present some modified versions of Deng entropy.
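The reduction to the Shannon entropy can be verified numerically. The sketch below assumes the usual form of Deng entropy, $E_d(m) = -\sum_{A} m(A) \log \frac{m(A)}{2^{|A|}-1}$, and uses the natural logarithm to match the convention stated in the Introduction; other presentations use base 2, which only rescales the value.

```python
import math

def deng_entropy(m):
    """Deng entropy: sum over focal elements of m(A) * log((2^|A| - 1) / m(A))."""
    return sum(v * math.log((2 ** len(a) - 1) / v) for a, v in m.items() if v > 0)

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Mass only on singletons: 2^1 - 1 = 1, so Deng entropy equals Shannon entropy.
singletons = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.3, frozenset({"c"}): 0.2}
# Mass on a non-singleton set adds the term m(A) * log(2^|A| - 1).
boxes = {frozenset({"R"}): 0.6, frozenset({"R", "G"}): 0.4}

print(deng_entropy(singletons), shannon_entropy([0.5, 0.3, 0.2]))
print(deng_entropy(boxes))
```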

Fractional Deng Entropy
In recent years, great attention has been given to fractional calculus. For this reason, several authors have studied various fractional entropies, motivated by the fact that they satisfy physical conditions of stability. In order to obtain an analog of (6), we introduce the concept of fractional Deng entropy in the following definition.
Definition 6. Let m be a BPA on a FOD X. We define the Fractional Deng Entropy (FDEn) of m as
$$E^q_d(m) = \sum_{A \subseteq X:\, m(A) > 0} m(A) \left[-\log \frac{m(A)}{2^{|A|}-1}\right]^q, \qquad 0 < q \le 1. \tag{7}$$
Example 1. (i) Assume that the FOD is X = {a, b, c}. For a mass function m(a) = m(b) = m(c) = 1/3, the associated fractional entropy and FDEn are obtained as follows:
$$S_q(m) = E^q_d(m) = (\log 3)^q.$$
It is obvious that, in this case, the FDEn is increasing in q ∈ (0, 1].
(ii) Assume there is a ∈ X such that m(a) = 1. The associated fractional entropy and FDEn coincide and are obtained as
$$S_q(m) = E^q_d(m) = 0.$$
Clearly, the fractional entropy and the FDEn are identical when the BPA assigns a positive mass only to singletons. Moreover, if there exists A ⊆ X such that m(A) > 0 and |A| > 1, we cannot evaluate the fractional entropy.
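Example 1 can be reproduced with a short function. The sketch assumes that the FDEn has the form $\sum_{A} m(A)\,[-\log(m(A)/(2^{|A|}-1))]^q$ with the natural logarithm, which is the form consistent with the properties used in this section (reduction to Deng entropy at q = 1, and limit 1 as q tends to 0). The function name is illustrative.

```python
import math

def fractional_deng_entropy(m, q):
    """Assumed FDEn: sum of m(A) * [log((2^|A| - 1) / m(A))]**q over focal elements."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    return sum(v * math.log((2 ** len(a) - 1) / v) ** q
               for a, v in m.items() if v > 0)

uniform = {frozenset({x}): 1 / 3 for x in "abc"}   # Example 1 (i)
degenerate = {frozenset({"a"}): 1.0}               # Example 1 (ii)

for q in (0.2, 0.6, 1.0):
    print(q, fractional_deng_entropy(uniform, q), fractional_deng_entropy(degenerate, q))
```

Since log 3 > 1, the value $(\log 3)^q$ grows with q in case (i), while case (ii) gives 0 for every q.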
The plot of the FDEn as a function of q ∈ (0, 1] is given in Figure 1. From Figure 1, it is seen that $E^q_d(m)$ is increasing in q and the maximum is achieved for q = 1, i.e., when the FDEn reduces to Deng entropy.
The plot of this FDEn is given in Figure 2. From Figure 2, it is seen that $E^q_d(m)$ is increasing in q, and the maximum is achieved when the FDEn reduces to Deng entropy.
In Figure 3, the plot of $E^q_d(m)$ for different values of $p_a$ is given. It is seen that for $p_a = 0.01$, $p_a = 0.80$ and $p_a = 0.99$, the plot of $E^q_d(m)$ is increasing, bathtub shaped (first decreasing and then increasing) and decreasing, respectively. In the above examples, it is seen that, as a function of q, $E^q_d(m)$ cannot be concave, and it can be increasing, decreasing or bathtub shaped. Furthermore, the supremum of the FDEn is achieved when q is near the boundary of the interval (0, 1]. Therefore, we can state the following theorem.
Theorem 1. Let m be a non-degenerate BPA on a FOD X and q ∈ (0, 1]. Then, the supremum of the FDEn as a function of q is attained for q ∈ {0, 1}, where q = 0 is understood as the limit q → 0, and the infimum is either attained at the extremes of the interval (0, 1) or it is a minimum assumed at a unique $q_0 \in (0, 1)$.
Proof. By noting that, for fixed x > 0, the function $g(p) = x^p$ is a convex function of p, we can conclude that the FDEn is a strictly convex function of q. Hence, we have three possible scenarios. In the first one, the FDEn is strictly increasing in q, and hence it assumes the maximum value for q = 1, i.e., when it reduces to Deng entropy, and the infimum is 1 by the normalization condition. In the second scenario, the FDEn is strictly decreasing; hence, the supremum is 1 and the minimum is assumed for q = 1. In the third case, there is a unique stationary point in (0, 1), which is an absolute minimum, whereas the supremum is given by $\max\{1, E_d(m)\}$.
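The convexity argument in the proof can be checked on a grid of q values. The sketch below assumes the FDEn form $\sum_A m(A)\,[-\log(m(A)/(2^{|A|}-1))]^q$ and uses a hypothetical BPA with m({a}) = 0.8 and m(X) = 0.2 on X = {a, b, c}, chosen only to illustrate the third scenario; it is not claimed to be one of the BPAs plotted in the figures.

```python
import math

def fden(m, q):
    """Assumed FDEn form with natural logarithm."""
    return sum(v * math.log((2 ** len(a) - 1) / v) ** q
               for a, v in m.items() if v > 0)

# Hypothetical BPA used only for illustration.
m = {frozenset("a"): 0.8, frozenset("abc"): 0.2}

qs = [i / 100 for i in range(1, 101)]          # grid on (0, 1]
vals = [fden(m, q) for q in qs]

# Discrete convexity: second differences of a convex function are non-negative.
second_diffs = [vals[i - 1] - 2 * vals[i] + vals[i + 1]
                for i in range(1, len(vals) - 1)]
q_min = qs[vals.index(min(vals))]

print("convex on grid:", all(d >= -1e-12 for d in second_diffs))
print("argmin q:", q_min, "supremum candidate:", max(1.0, fden(m, 1.0)))
```

For this BPA the minimum falls strictly inside (0, 1), and the values stay below $\max\{1, E_d(m)\}$, as the third scenario of the proof describes.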
In the following theorem, we study the maximum FDEn for a fixed value of q. This is an important issue in the theory of measures of uncertainty; see, for instance, [30] for the study of the maximum Deng entropy.
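Theorem 2. Let X be a FOD, q ∈ (0, 1] and m be a BPA that assigns positive mass to each non-empty subset of X. The maximum FDEn is attained if the BPA m is defined as
$$m(A) = \frac{2^{|A|}-1}{\sum_{B \subseteq X} \left(2^{|B|}-1\right)}, \qquad A \subseteq X. \tag{8}$$
Proof. For a fixed q ∈ (0, 1], the FDEn is given by (7) as
$$E^q_d(m) = \sum_{A \subseteq X} m(A) \left[-\log \frac{m(A)}{2^{|A|}-1}\right]^q. \tag{9}$$
We have to maximize (9) subject to the constraint
$$\sum_{A \subseteq X} m(A) = 1. \tag{10}$$
We use the method of Lagrange multipliers, and we have to compute the partial derivatives of the function
$$\tilde{E}^q_d(m) = \sum_{A \subseteq X} m(A) \left[-\log \frac{m(A)}{2^{|A|}-1}\right]^q + \lambda \left(\sum_{A \subseteq X} m(A) - 1\right),$$
which are
$$\frac{\partial \tilde{E}^q_d}{\partial m(A)} = \left[-\log \frac{m(A)}{2^{|A|}-1}\right]^q - q \left[-\log \frac{m(A)}{2^{|A|}-1}\right]^{q-1} + \lambda.$$
In order for all the partial derivatives of $\tilde{E}^q_d$ to vanish, $\frac{m(A)}{2^{|A|}-1} = K$ has to be invariant with respect to A. In fact, the function
$$h(z) = (-\log z)^q - q (-\log z)^{q-1}$$
is strictly decreasing in z ∈ (0, 1), since
$$h'(z) = -\frac{q}{z} (-\log z)^{q-2} \left(-\log z + 1 - q\right) < 0$$
and $z < e^{1-q}$. Hence, by the constraint (10), we get
$$K = \frac{1}{\sum_{B \subseteq X} \left(2^{|B|}-1\right)},$$
and the BPA m, which maximizes the FDEn, is given in (8). Then, the maximum FDEn is given by
$$E^q_d(m) = \left[\log \sum_{A \subseteq X} \left(2^{|A|}-1\right)\right]^q.$$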
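A quick numerical check of Theorem 2 is sketched below. It assumes the FDEn form $\sum_A m(A)\,[-\log(m(A)/(2^{|A|}-1))]^q$ and builds the BPA with masses proportional to $2^{|A|}-1$, for which the FDEn should equal $[\log \sum_A (2^{|A|}-1)]^q$. The helper names are illustrative, and the uniform BPA is included only as a comparison point.

```python
import math
from itertools import combinations

def nonempty_subsets(frame):
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def fden(m, q):
    """Assumed FDEn form with natural logarithm."""
    return sum(v * math.log((2 ** len(a) - 1) / v) ** q
               for a, v in m.items() if v > 0)

def max_fden_bpa(frame):
    """BPA with m(A) proportional to 2^|A| - 1 over all non-empty subsets."""
    weights = {a: 2 ** len(a) - 1 for a in nonempty_subsets(frame)}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

frame = {"a", "b", "c"}
m_star = max_fden_bpa(frame)
uniform = {a: 1 / 7 for a in nonempty_subsets(frame)}

q = 0.6
print(fden(m_star, q), math.log(19) ** q, fden(uniform, q))
```

For |X| = 3 the normalizing sum is 3·1 + 3·3 + 7 = 19, so the maximum is $(\log 19)^q$.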

Fractional Deng Extropy
In the following definition, we present the Deng extropy introduced by Buono and Longobardi [9] as a dual measure of uncertainty to Deng entropy.
Example 6. In Figure 5, the plot of $EX^q_d(m)$ is given. One can see that, as a function of q, it has a convex parabolic shape and the maximum is achieved when it reduces to Deng extropy. In Figure 6, the plot of $EX^q_d(m)$ is given. One can see that, as a function of q, it has a convex parabolic shape and the maximum is achieved when q tends to zero. Similar to the FDEn, in the above examples, it is seen that the function $EX^q_d(m)$ cannot be concave, and it can be increasing, decreasing or bathtub shaped. Furthermore, the supremum of the FDEx is achieved when q is near the boundary of the interval (0, 1]. The following theorem is immediate.
Theorem 3. Let m be a non-degenerate BPA on a FOD X and q ∈ (0, 1]. Then, the supremum of the FDEx as a function of q is attained for q ∈ {0, 1}, where q = 0 is understood as the limit q → 0, and the infimum is either attained at the extremes of the interval (0, 1) or it is a minimum assumed at a unique $q_0 \in (0, 1)$.

Proof. The proof is similar to that of Theorem 1; in this case, the supremum is given by $\max\{N - 1 + m(X), EX_d(m)\}$, where N is the number of focal elements different from X.
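The limit value N − 1 + m(X) quoted in the proof can be checked numerically. The sketch below assumes that the FDEx has the form $\sum_{A \subset X:\, m(A) > 0} (1 - m(A)) \left[-\log \frac{1 - m(A)}{2^{|A^c|}-1}\right]^q$, where $A^c$ is the complement of A in X; this form is an assumption chosen to be consistent with that limit and with the reduction to Deng extropy at q = 1. The BPA is hypothetical.

```python
import math

def fractional_deng_extropy(m, frame, q):
    """Assumed FDEx form; the sum runs over focal elements different from the frame."""
    total = 0.0
    for a, v in m.items():
        if 0 < v < 1 and a != frame:
            c = 2 ** len(frame - a) - 1      # 2^|A^c| - 1
            total += (1 - v) * math.log(c / (1 - v)) ** q
    return total

frame = frozenset("abc")
# Hypothetical BPA: two focal elements different from X, plus mass on X itself.
m = {frozenset("a"): 0.5, frozenset("bc"): 0.3, frame: 0.2}

n_focal = sum(1 for a, v in m.items() if v > 0 and a != frame)
print("q -> 0 limit:", fractional_deng_extropy(m, frame, 1e-9), n_focal - 1 + m[frame])
print("q = 1:", fractional_deng_extropy(m, frame, 1.0))
```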
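Next, in analogy with Theorem 2, we obtain an upper bound for the maximum FDEx with a fixed value of q.
Theorem 4. Let X be a FOD, q ∈ (0, 1] and m be a BPA that assigns positive mass to each non-empty subset of X. For a fixed value of m(X), an upper bound for the FDEx is attained at the fictitious BPA $\tilde{m}$ such that $\tilde{m}(X) = m(X)$ and
$$\tilde{m}(A) = 1 - K \left(2^{|A^c|}-1\right), \quad A \subset X, \qquad K = \frac{2^{|X|} - 3 + m(X)}{\sum_{B \subset X} \left(2^{|B^c|}-1\right)}. \tag{12}$$
Proof. The proof is similar to the one given for Theorem 2. After establishing that $\frac{1 - m(A)}{2^{|A^c|}-1} = K$ is invariant with respect to A, we get $m(A) = 1 - K \left(2^{|A^c|}-1\right)$ and, by summing over the non-empty A ⊂ X,
$$1 - m(X) = 2^{|X|} - 2 - K \sum_{A \subset X} \left(2^{|A^c|}-1\right).$$
Hence, the BPA which maximizes the FDEx is given in (12). We have to specify that it is a fictitious BPA, in the sense that $\tilde{m}(A)$ may be negative for some subset of X.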

Example 10. Based on the result of Theorem 4, let us evaluate the upper bound for the FDEx in the case |X| = 3 with fixed m(X). We have three subsets of cardinality one and three of cardinality two, and then the upper bound is given by
$$\left(5 + m(X)\right) \left[\log \frac{12}{5 + m(X)}\right]^q.$$
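The computation in this example can be sketched in code under explicit assumptions: the FDEx is taken to be $\sum_{A \subset X} (1 - m(A)) \left[-\log \frac{1 - m(A)}{2^{|A^c|}-1}\right]^q$, and the stationarity condition is taken to be $(1 - \tilde{m}(A)) / (2^{|A^c|}-1) = K$ for every non-empty proper subset A. Neither the function name nor the chosen value m(X) = 0.2 comes from the paper.

```python
import math

def fdex_upper_bound(n, m_x, q):
    """Bound for a frame of size n with fixed mass m_x on the whole frame.

    Subsets are grouped by cardinality k = 1, ..., n - 1; each has
    complement size n - k and coefficient 2^(n - k) - 1.
    """
    sizes = [(k, math.comb(n, k)) for k in range(1, n)]
    total_c = sum(cnt * (2 ** (n - k) - 1) for k, cnt in sizes)
    K = (2 ** n - 3 + m_x) / total_c
    # Fictitious masses, one value per cardinality (they may be negative).
    fictitious = {k: 1 - K * (2 ** (n - k) - 1) for k, _ in sizes}
    bound = total_c * K * math.log(1 / K) ** q
    return K, fictitious, bound

K, fictitious, bound = fdex_upper_bound(3, 0.2, 0.6)
print("K:", K)
print("masses by cardinality:", fictitious)
print("upper bound:", bound)
```

For |X| = 3 the coefficients sum to 3·3 + 3·1 = 12, so K = (5 + m(X))/12, and the singleton masses of the fictitious BPA come out negative, which is why it is called fictitious.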

Application to a Problem of Classification
In this section, we apply FDEn and FDEx to a problem of classification. We analyze a dataset given in [31] about typical qualities of Italian wines. This dataset is composed of 178 instances and, for each one, thirteen attributes are given. The instances of the dataset are divided into three classes of wine: class 1, class 2 and class 3. We use six attributes to identify the correct class of each instance. In particular, the attributes involved in this example are: Alcohol, Malic acid, Ash, OD280/OD315 of diluted wines (OD), Color intensity (CI) and Proline. We use the method of max-min values to generate a model of interval numbers. In particular, for a fixed attribute, we study the interval of variability in a single class, and then we intersect the intervals of more classes. The model of interval numbers is shown in Table 3.

Table 3. The model of interval numbers (columns: Class, Alcohol, Malic acid, Ash, OD, CI, Proline).

Suppose the selected instance is (13.860, 1.5100, 2.6700, 3.1600, 3.3800, 410). From the dataset, we know that the selected instance belongs to class 2, and our purpose is to classify it correctly. We generate six BPAs, one for each attribute, by using a method based on the similarity of interval numbers proposed by Kang et al. [32]. Given two intervals $A = [a_1, a_2]$ and $B = [b_1, b_2]$, their similarity S(A, B) can be defined as
$$S(A, B) = \frac{1}{1 + \alpha D(A, B)},$$
where α > 0 is the coefficient of support (here we use α = 5) and D(A, B) is the distance of intervals A and B defined in [33] as
$$D^2(A, B) = \left[\frac{a_1 + a_2}{2} - \frac{b_1 + b_2}{2}\right]^2 + \frac{1}{3} \left[\left(\frac{a_2 - a_1}{2}\right)^2 + \left(\frac{b_2 - b_1}{2}\right)^2\right].$$
For each attribute, we can get seven values of similarity by choosing as A the intervals given in Table 3 and as B the corresponding singleton of the selected instance. Then, by normalizing the obtained values, we get six BPAs, as reported in Table 4. Without any additional information, we can evaluate a final BPA giving the same weight to each attribute, i.e., by summing the six values related to a focal element and then dividing by six. In this way, we get the final BPA shown in Table 5. Now, based on the BPA in Table 5, we can evaluate the PPT (5) of the classes, and we get PPT(1) = 0.3500, PPT(2) = 0.3464, PPT(3) = 0.3036.
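The BPA generation step for a single attribute can be sketched as follows. The code assumes the similarity $S(A, B) = 1/(1 + \alpha D(A, B))$ and the interval distance $D^2(A, B) = [\frac{a_1+a_2}{2} - \frac{b_1+b_2}{2}]^2 + \frac{1}{3}[(\frac{a_2-a_1}{2})^2 + (\frac{b_2-b_1}{2})^2]$. The interval values below are invented placeholders, not the entries of Table 3, and the function names are illustrative.

```python
import math

def interval_distance(a, b):
    """Distance between intervals a = (a1, a2) and b = (b1, b2)."""
    mid_gap = (a[0] + a[1]) / 2 - (b[0] + b[1]) / 2
    spread = ((a[1] - a[0]) / 2) ** 2 + ((b[1] - b[0]) / 2) ** 2
    return math.sqrt(mid_gap ** 2 + spread / 3)

def similarity(a, b, alpha=5.0):
    return 1 / (1 + alpha * interval_distance(a, b))

def bpa_from_intervals(intervals, value, alpha=5.0):
    """Normalize the similarities between each interval and the point [value, value]."""
    sims = {k: similarity(iv, (value, value), alpha) for k, iv in intervals.items()}
    total = sum(sims.values())
    return {k: s / total for k, s in sims.items()}

# Placeholder intervals for one attribute (NOT the values of Table 3).
intervals = {
    frozenset({1}): (12.8, 14.8),
    frozenset({2}): (11.0, 13.9),
    frozenset({3}): (12.2, 14.3),
    frozenset({1, 2}): (12.8, 13.9),
    frozenset({1, 3}): (12.8, 14.3),
    frozenset({2, 3}): (12.2, 13.9),
    frozenset({1, 2, 3}): (12.8, 13.9),
}
bpa = bpa_from_intervals(intervals, 13.86)
for focal, mass in bpa.items():
    print(sorted(focal), round(mass, 4))
```

Focal elements whose interval is narrow and centered near the observed value receive the largest mass.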
Hence, the focal element with the highest PPT is class 1, so it would be our final hypothesis, which is not the correct decision.
We try to improve the described method by using FDEn. Let us fix the value q = 0.6. We evaluate the FDEn of the BPAs given in Table 4 and we obtain the results shown in Table 6. Since a higher value of FDEn means a higher uncertainty, we can give more weight to the attributes with lower FDEn. In particular, we define the weights by normalizing to 1 the reciprocal values of the fractional Deng entropies. We obtain the weights presented in Table 7. Based on the weights in Table 7, we get a weighted version of the final BPA, as shown in Table 8. Finally, based on the BPA in Table 8, we evaluate the PPT of the classes. The focal element with the highest PPT is class 2, so it is our final hypothesis and we make the correct decision. Along the same lines, we can use FDEx. In Table 9, we give the recognition rates of the non-weighted method and of the methods based on FDEn and FDEx for different choices of q.
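The weighting scheme described above can be sketched end to end. The code assumes the FDEn form $\sum_A m(A)\,[-\log(m(A)/(2^{|A|}-1))]^q$ and the usual pignistic probability of a singleton; the two attribute BPAs are invented for illustration and are not the entries of Table 4.

```python
import math

def fden(m, q):
    """Assumed FDEn form with natural logarithm."""
    return sum(v * math.log((2 ** len(a) - 1) / v) ** q
               for a, v in m.items() if v > 0)

def fden_weights(bpas, q):
    """Weights proportional to 1 / FDEn (requires every FDEn to be positive)."""
    inv = [1 / fden(m, q) for m in bpas]
    total = sum(inv)
    return [x / total for x in inv]

def weighted_bpa(bpas, weights):
    combined = {}
    for m, w in zip(bpas, weights):
        for a, v in m.items():
            combined[a] = combined.get(a, 0.0) + w * v
    return combined

def ppt_singleton(m, theta):
    """Pignistic probability of one class: mass of each focal set shared equally."""
    return sum(v / len(a) for a, v in m.items() if theta in a)

# Hypothetical BPAs for two attributes over the classes {1, 2, 3}.
bpa1 = {frozenset({1}): 0.5, frozenset({2}): 0.3, frozenset({1, 2, 3}): 0.2}
bpa2 = {frozenset({2}): 0.7, frozenset({1, 2}): 0.2, frozenset({1, 2, 3}): 0.1}
bpas = [bpa1, bpa2]

weights = fden_weights(bpas, q=0.6)
final = weighted_bpa(bpas, weights)
ppt = {c: ppt_singleton(final, c) for c in (1, 2, 3)}
print("weights:", weights)
print("PPT:", ppt, "-> predicted class", max(ppt, key=ppt.get))
```

Passing equal weights to `weighted_bpa` reproduces the non-weighted average of the BPAs, so the two methods differ only in how the weights are chosen.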

Conclusions
In this paper, fractional Deng entropy and extropy have been defined from the definitions of Deng entropy and extropy. These measures have been compared with other well-known ones, and some examples have been proposed. Characterization results for the maximum fractional Deng entropy and extropy have been given, and finally, a problem of classification based on a dataset has been discussed in order to emphasize the relevance of these measures in pattern recognition.
Author Contributions: The authors contributed equally to this paper, working together to conceptualize and apply their new definitions. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: http://archive.ics.uci.edu/ml (accessed on: 18 April 2021).

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: