A Novel Belief Entropy for Measuring Uncertainty in Dempster-Shafer Evidence Theory Framework Based on Plausibility Transformation and Weighted Hartley Entropy

Dempster-Shafer evidence theory (DST) has shown great advantages in tackling uncertainty in a wide variety of applications. However, how to quantify the information-based uncertainty of a basic probability assignment (BPA) with belief entropy in the DST framework is still an open issue. The main work of this study is to define a new belief entropy for measuring the uncertainty of BPAs. The proposed belief entropy has two components. The first component is based on the summation of the probability mass function (PMF) of the single events contained in each focal element, obtained using the plausibility transformation. The second component is the same as the weighted Hartley entropy. The two components effectively measure the discord uncertainty and the non-specificity uncertainty in the DST framework, respectively. The proposed belief entropy is proved to satisfy the majority of the desired properties for an uncertainty measure in the DST framework. In addition, when the BPA is a probability distribution, the proposed measure degenerates to the Shannon entropy. The feasibility and superiority of the new belief entropy are verified by the results of numerical experiments.

Decision making in the framework of DST is based on the combination results of BPAs. Nonetheless, how to measure the uncertainty of a BPA is still an open issue that has not been completely solved [27]. The uncertainty of a BPA mainly comprises discord uncertainty and non-specificity uncertainty. Working out the uncertainty of a BPA is the groundwork and precondition of applying DST.
There are two functions associated with each BPA, called the belief function Bel(A) and the plausibility function Pl(A), respectively. The two functions are defined as follows:

Bel(A) = ∑_{B⊆A} m(B),  Pl(A) = ∑_{B∩A≠∅} m(B).

The plausibility function Pl(A) denotes the degree to which the BPA potentially supports A, while the belief function Bel(A) denotes the degree to which the BPA definitely supports A. Thus, Bel(A) and Pl(A) can be seen as the lower and upper probability of A.
Suppose m_1 and m_2 are two independent BPAs on the same FOD Θ. They can be combined by using the Dempster-Shafer combination rule as follows:

(m_1 ⊕ m_2)(A) = (1/(1 − k)) ∑_{B∩C=A} m_1(B) m_2(C), A ≠ ∅, with k = ∑_{B∩C=∅} m_1(B) m_2(C), (5)

where k is the conflict coefficient measuring the degree of conflict among the BPAs. The operator ⊕ denotes the Dempster-Shafer combination rule. Please note that the Dempster-Shafer combination rule is unavailable for combining BPAs such that k = 1, i.e., completely conflicting BPAs.
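The combination rule can be sketched in code. In the sketch below, a BPA is represented as a dict mapping frozenset focal elements to masses, and the function name is our own illustrative choice, not from the paper:

```python
from itertools import product

def combine_dempster(m1, m2):
    """Combine two BPAs on the same FOD with the Dempster-Shafer rule.

    BPAs are dicts mapping frozenset focal elements to their masses.
    Raises ValueError when k = 1 (completely conflicting evidence)."""
    combined = {}
    k = 0.0  # conflict coefficient: total mass falling on the empty set
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + mB * mC
        else:
            k += mB * mC
    if abs(1.0 - k) < 1e-12:
        raise ValueError("total conflict: k = 1, the rule is undefined")
    # normalize the surviving mass by 1 - k
    return {A: v / (1.0 - k) for A, v in combined.items()}
```

For instance, combining m_1 = {{a}: 0.6, {a, b}: 0.4} with m_2 = {{a}: 0.5, {b}: 0.5} gives k = 0.3 and a normalized result concentrated on {a} and {b}.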

Probability Transformation
There are many ways to transform a BPA m to a PMF. Here, the pignistic transformation and the plausibility transformation are introduced.
Let m be a BPA on FOD Θ. Its associated probabilistic expression as a PMF on Θ is defined as follows:

BetP(x) = ∑_{A⊆Θ, x∈A} m(A)/|A|,

where |A| is the cardinality of A. The transformation between m and BetP(x) is called the pignistic transformation.
Pt(x) is a probabilistic expression as a PMF obtained from m by using the plausibility transformation as follows:

Pt(x) = Pl({x}) / ∑_{y∈Θ} Pl({y}),

where Pl(x) is the plausibility function of the specific element x in Θ. The transformation between m and Pt(x) is called the plausibility transformation.
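Both transformations are easy to state in code. The sketch below assumes the dict-of-frozensets BPA representation; all helper names are illustrative:

```python
def belief(m, A):
    """Bel(A): total mass of focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def plausibility(m, A):
    """Pl(A): total mass of focal elements intersecting A."""
    return sum(v for B, v in m.items() if B & A)

def pignistic(m, frame):
    """BetP(x): each focal element shares its mass equally among its elements."""
    return {x: sum(v / len(A) for A, v in m.items() if x in A) for x in frame}

def plausibility_transform(m, frame):
    """Pt(x) = Pl({x}) / sum over y of Pl({y})."""
    pl = {x: plausibility(m, frozenset({x})) for x in frame}
    total = sum(pl.values())
    return {x: p / total for x, p in pl.items()}
```

For m = {{a}: 0.5, {a, b}: 0.5} on the frame {a, b}, the pignistic transformation gives BetP(a) = 0.75 and BetP(b) = 0.25, while the plausibility transformation gives Pt(a) = 2/3 and Pt(b) = 1/3, illustrating that the two transforms generally differ.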

Shannon Entropy
Let Ω be a FOD with possible values {w_1, w_2, . . . , w_n}. The Shannon entropy is explicitly defined as:

H_s = ∑_{i=1}^{n} p(w_i) log_2 (1/p(w_i)),

where p(w_i) is the probability of alternative w_i, which satisfies ∑_{i=1}^{n} p(w_i) = 1. If some p(w_i) = 0, we follow the convention that p(w_i) log_2 (1/p(w_i)) = 0, as lim_{x→0+} x log_2(x) = 0. Please note that we will simply use log for log_2 in the rest of this paper.
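The definition above, including the 0 · log(1/0) = 0 convention, fits in a one-line sketch (the function name is ours):

```python
import math

def shannon_entropy(probs):
    """H(p) = sum_i p(w_i) * log2(1/p(w_i)); terms with p(w_i) = 0 contribute 0."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)
```

The uniform distribution over four alternatives gives H = log 4 = 2 bits, and a degenerate distribution gives H = 0.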

Desired Properties of Uncertainty Measures in The DS Theory
In the research of Klir and Wierman [32], Klir and Lewis [45], and Klir [46], five basic required properties are defined for uncertainty measures in the DST framework, namely probability consistency, set consistency, range, sub-additivity, and additivity. These requirements are detailed as follows.

• Probability consistency. Let m be a BPA on FOD X. If m is a Bayesian BPA, then H(m) = ∑_{x∈X} m({x}) log (1/m({x})).
• Additivity. Let m_X and m_Y be distinct BPAs for FOD X and FOD Y, respectively. The combined BPA m_X ⊕ m_Y using the Dempster-Shafer combination rule must satisfy the following equality: H(m_X ⊕ m_Y) = H(m_X) + H(m_Y).
• Sub-additivity. Let m be a BPA on the space X × Y, with marginal BPAs m↓X and m↓Y on FOD X and FOD Y, respectively. The uncertainty measure must satisfy the following inequality: H(m) ≤ H(m↓X) + H(m↓Y).
• Set consistency. Let m be a BPA on FOD X. If there exists a focal element A ⊆ X such that m(A) = 1, then the uncertainty measure must degrade to the Hartley measure: H(m) = log |A|.
• Range. Let m be a BPA on FOD X. The range of an uncertainty measure H(m) must be [0, log |X|].
These properties, as illuminated in the DST framework, start from their verification by the Shannon entropy in PT. In DST, there exist more situations of uncertainty than in the PT framework [47]. Accordingly, by analyzing the shortcomings of these properties, Jiroušek and Shenoy added four other desired properties for measuring uncertainty in the DST framework, including consistency with DST semantics, non-negativity, maximum entropy, and monotonicity [41].
The uncertainty measure for a BPA in DST must agree with the DST semantics [48]. Many uncertainty measures are based on PMFs transformed from BPAs [49][50][51]. However, only the plausibility transformation is compatible with the Dempster-Shafer combination rule [41,44]. Therefore, the property of consistency with DST semantics is presented, requiring the uncertainty measure to satisfy the tenets of the DST framework.

• Consistency with DST semantics. Let m_1 and m_2 be two BPAs on the same FOD. If an uncertainty measure is based on a probability transformation of BPAs, which transforms a BPA m to a PMF P_m, then the PMFs of m_1 and m_2 must satisfy the following condition:

P_{m_1 ⊕ m_2} = P_{m_1} ⊗ P_{m_2},

where ⊗ denotes the Bayesian combination rule [41], i.e., pointwise multiplication followed by normalization. Notice that this property does not presuppose the use of a probability transformation in the uncertainty measure.
The property of additivity is easily satisfied by most definitions of uncertainty measures [41]. The property of consistency with DST semantics can be regarded as a reinforcement of the additivity property, which makes sure that any uncertainty measure in the DST framework follows the Dempster-Shafer combination rule.
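The compatibility of the plausibility transformation with Dempster's rule can be checked numerically. The sketch below (our own helper names, with BPAs as dicts of frozensets) combines two arbitrary BPAs by Dempster's rule and verifies that the plausibility transformation of the result matches the Bayesian combination of the individual plausibility transformations:

```python
from itertools import product

def ds_combine(m1, m2):
    """Dempster-Shafer combination of two BPAs on the same FOD."""
    out, k = {}, 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        A = B & C
        if A:
            out[A] = out.get(A, 0.0) + mB * mC
        else:
            k += mB * mC
    return {A: v / (1.0 - k) for A, v in out.items()}

def pl_transform(m, frame):
    """Pt(x) = Pl({x}) normalized over the frame."""
    pl = {x: sum(v for B, v in m.items() if x in B) for x in frame}
    s = sum(pl.values())
    return {x: p / s for x, p in pl.items()}

def bayes_combine(p, q):
    """Bayesian combination: pointwise product, then normalization."""
    r = {x: p[x] * q[x] for x in p}
    s = sum(r.values())
    return {x: v / s for x, v in r.items()}

frame = {'a', 'b', 'c'}
m1 = {frozenset({'a'}): 0.5, frozenset({'a', 'b'}): 0.3, frozenset(frame): 0.2}
m2 = {frozenset({'b'}): 0.4, frozenset({'a', 'c'}): 0.6}

lhs = pl_transform(ds_combine(m1, m2), frame)          # Pt(m1 ⊕ m2)
rhs = bayes_combine(pl_transform(m1, frame),
                    pl_transform(m2, frame))           # Pt(m1) ⊗ Pt(m2)
```

With these two example BPAs, both sides evaluate to the same PMF (15/23, 5/23, 3/23 on a, b, c), as the property predicts.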
Since there are more types of uncertainty in the DST framework than in the PT framework, uncertainty measures in the DST framework may call for a wider range than the [0, log |X|] of the PT framework. Thus, in Jiroušek and Shenoy's opinion, the properties of non-negativity, maximum entropy, and monotonicity are pivotal for an uncertainty measure in the DST framework.
• Non-negativity. Let m be a BPA on FOD X. The uncertainty measure H(m) must satisfy the following inequality: H(m) ≥ 0, where equality holds if and only if m is Bayesian and m({x}) = 1 for some x ∈ X.
• Maximum entropy. Let m be a BPA on FOD X. The vacuous BPA m_v should have the most uncertainty, so the uncertainty measure must satisfy the following inequality: H(m) ≤ H(m_v), where equality holds if and only if m = m_v.
• Monotonicity. Let m_vX and m_vY be the vacuous BPAs of FOD X and FOD Y, respectively. If |X| < |Y|, then H(m_vX) ≤ H(m_vY).
The property of set consistency entails that the uncertainty of a vacuous BPA m_v for FOD X is log |X|. The probability consistency entails that the uncertainty of a Bayesian BPA m_e, which has equally likely probabilities on X, is log |X| too. However, these two requirements contradict the property of maximum entropy, which considers H(m_v) > H(m_e). This contradiction is a debatable open issue. Some researchers suggest that the uncertainty of these two kinds of BPA should be equal and should be the maximum possible uncertainty, as neither provides information to help us make a determinate decision [52,53]. Other researchers deem the uncertainty of a vacuous BPA to be greater than that of a uniform Bayesian BPA, which is demonstrated by the Ellsberg paradox phenomenon [54][55][56]. To provide a comprehensive understanding of our definition of the uncertainty measure, all the above-mentioned properties are taken into account.

The Existing Definitions of Belief Entropy of BPAs
The majority of uncertainty measures take the Shannon entropy as their starting point, which plays an important role in addressing uncertainty in the PT framework. Nevertheless, the Shannon entropy has inherent limitations in handling uncertainty in DST, as there are more types of uncertainty [27,57]. This is reasonable because a BPA includes more information than a probability distribution [4]. In the early literature, definitions of belief entropy focused only on one aspect, either the discord uncertainty or the non-specificity uncertainty of the BPAs. Then, Yager made a contribution to the distinction between discord uncertainty and non-specificity uncertainty [35]. Thereafter, both discord and non-specificity are taken into consideration in most definitions of belief entropy. Some representative belief entropies and their definitions are listed as follows.
Höhle. One of the earliest uncertainty measures in DST is presented by Höhle as shown in [34]:

H_o(m) = ∑_{A∈2^X} m(A) log (1/Bel(A)),

where Bel(A) is the belief function of proposition A. H_o(m) only considers the discord part of the uncertainty.
Nguyen defines the belief entropy of a BPA m using the original BPAs [58]:

H_n(m) = ∑_{A∈2^X} m(A) log (1/m(A)).

As with the definition of H_o(m), H_n(m) only captures the discord part of the uncertainty. Dubois and Prade define the belief entropy using the cardinality of the focal elements [33]:

H_d(m) = ∑_{A∈2^X} m(A) log |A|.

H_d(m) considers only the non-specificity portion of the uncertainty. Dubois and Prade's definition can be regarded as the weighted Hartley entropy.
Pal et al. define a belief entropy as [59]:

H_p(m) = ∑_{A∈2^X} m(A) log (|A|/m(A)).

In H_p(m), the first component is the measure of discord uncertainty, and the second component is the measure of non-specificity uncertainty.
Jousselme et al. define a belief entropy based on the pignistic transformation [38]:

H_j(m) = ∑_{x∈X} BetP(x) log (1/BetP(x)),

where BetP(x) is the PMF of the pignistic transformation; H_j(m) is the Shannon entropy of BetP(x). Deng defines a belief entropy, namely Deng entropy, as follows [39]:

H_deng(m) = ∑_{A∈2^X} m(A) log ((2^|A| − 1)/m(A)).

H_deng(m) is very similar to the definition of H_p(m), while H_deng(m) employs 2^|A| − 1 instead of |A| to measure the non-specificity uncertainty of the BPA.
Pan and Deng develop the Deng entropy H_deng(m) with the definition [40]:

H_pd(m) = ∑_{A∈2^X} ((Bel(A) + Pl(A))/2) log ((2^|A| − 1) / ((Bel(A) + Pl(A))/2)).

It is obvious that all these uncertainty measures are extensions of the Shannon entropy in DST. Apart from the aforementioned belief entropies, there are, of course, some other entropy-based uncertainty measures for BPAs in the DST framework. One can find an extensive and detailed introduction to these methods in the literature [41,47].
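The classic measures listed above are straightforward to compute. The following is a hedged sketch with our own function names, again assuming BPAs as dicts mapping frozensets to masses:

```python
import math

def _bel(m, A):
    """Bel(A): total mass of focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def hohle(m):
    """Hohle: discord only, sum of m(A) * log2(1/Bel(A))."""
    return sum(v * math.log2(1.0 / _bel(m, A)) for A, v in m.items())

def nguyen(m):
    """Nguyen: discord only, sum of m(A) * log2(1/m(A))."""
    return sum(v * math.log2(1.0 / v) for A, v in m.items())

def dubois_prade(m):
    """Dubois-Prade (weighted Hartley): non-specificity only, sum of m(A) * log2|A|."""
    return sum(v * math.log2(len(A)) for A, v in m.items())

def pal(m):
    """Pal et al.: discord + non-specificity, sum of m(A) * log2(|A|/m(A))."""
    return sum(v * math.log2(len(A) / v) for A, v in m.items())

def deng(m):
    """Deng entropy: replaces |A| in Pal's measure with 2^|A| - 1."""
    return sum(v * math.log2((2 ** len(A) - 1) / v) for A, v in m.items())
```

For the vacuous BPA on a two-element FOD, the discord-only measures vanish, the weighted Hartley entropy gives log 2 = 1 bit, and the Deng entropy gives log 3 ≈ 1.585 bits, illustrating the effect of the exponential non-specificity term.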
Jiroušek and Shenoy define a concept for measuring uncertainty as follows [41]:

H_JS(m) = ∑_{x∈X} Pt(x) log (1/Pt(x)) + ∑_{A∈2^X} m(A) log |A|.

H_JS(m) consists of two components. The first part is the Shannon entropy of the PMF obtained by the plausibility transformation, which is associated with discord uncertainty. The second part is the entropy of Dubois and Prade for measuring non-specificity in BPAs. H_JS(m) satisfies six desired properties, including consistency with DST semantics, non-negativity, maximum entropy, monotonicity, probability consistency, and additivity. Moreover, the properties of range and set consistency are expanded.
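A sketch of H_JS under the same dict-of-frozensets BPA representation (our helper names):

```python
import math

def pt(m, frame):
    """Plausibility transformation: Pt(x) = Pl({x}) normalized over the frame."""
    pl = {x: sum(v for B, v in m.items() if x in B) for x in frame}
    s = sum(pl.values())
    return {x: p / s for x, p in pl.items()}

def h_js(m, frame):
    """Jirousek-Shenoy entropy: Shannon entropy of Pt plus the weighted Hartley term."""
    p = pt(m, frame)
    discord = sum(v * math.log2(1.0 / v) for v in p.values() if v > 0)
    nonspec = sum(v * math.log2(len(A)) for A, v in m.items())
    return discord + nonspec
```

For the vacuous BPA on a two-element FOD, both components equal log 2, so H_JS = 2 bits, whereas the uniform Bayesian BPA on the same FOD yields only 1 bit; H_JS therefore ranks the vacuous BPA strictly above the uniform one.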

The Proposed Belief Entropy
Although H_JS(m) better meets the requirements of the basic properties for an uncertainty measure, it has an intrinsic defect. The first part of H_JS(m), using the Shannon entropy, captures only the probabilities obtained by the plausibility transformation, which may lead to information loss. As argued for H_pd(m), the probability interval [Bel(A), Pl(A)] can provide more information about the BPA on each proposition. However, H_pd(m) considers only the numerical average of the probability interval, which lacks practical physical significance. In this study, by combining the merits of H_JS(m) and H_pd(m), a new belief-entropy-based uncertainty measure in the DST framework is proposed as follows:

H_PQ(m) = ∑_{A∈2^X} m(A) log (1/Pm(A)) + ∑_{A∈2^X} m(A) log |A|,

where Pm(A) = ∑_{x∈A} Pt(x) is the summation of the plausibility-transformation-based PMF over the elements x contained in A. Similar to most belief entropies, the first component ∑_{A∈2^X} m(A) log (1/Pm(A)) in H_PQ(m) is designed to measure the discord uncertainty of the BPA. The information contained not only in the BPAs but also in the plausibility function, via Pt(x), is taken into consideration. Since Pt(x) reflects the support degree of different propositions for the element x, it provides more information than m(A) alone. Furthermore, Pm(A) = ∑_{x∈A} Pt(x) satisfies Bel(A) ≤ Pm(A) ≤ Pl(A), so it can be seen as a representative of the probability interval. Finally, the second component ∑_{A∈2^X} m(A) log |A| in H_PQ(m) is the same as H_d(m) and measures the non-specificity uncertainty of the BPA. Therefore, we believe that the newly proposed belief entropy can measure the uncertainty of BPAs in the DST framework more effectively. The property analysis of H_PQ(m) is explored as follows.
(1) Consistency with DST semantics. The first part in H PQ (m) uses Pt(x) based on the plausibility transformation, which is compatible with the definition of the property. The second part is not a Shannon entropy based on probability transformation. Thus, H PQ (m) satisfies the consistency with DST semantics property.
(2) Non-negativity. Since Pt(x) is a PMF, Pm(A) = ∑_{x∈A} Pt(x) ≤ 1 for every focal element A, so each term m(A) log (1/Pm(A)) is non-negative; together with m(A) log |A| ≥ 0, this gives H_PQ(m) ≥ 0.
(8) Additivity. Let m_X and m_Y be two BPAs on FOD X and FOD Y, respectively, with A ⊆ 2^X and B ⊆ 2^Y. Let C = A × B be the corresponding joint focal element on X × Y, x ∈ X, and y ∈ Y. Let m be the joint BPA defined on X × Y obtained by using Equation (10). Expanding H_PQ(m) over the joint focal elements C = A × B and applying the decomposition proved in [33], the discord and non-specificity terms each split into the corresponding terms of m_X and m_Y, so that H_PQ(m) = H_PQ(m_X) + H_PQ(m_Y). Hence, H_PQ(m) satisfies the additivity property.
In summary, the new belief entropy H_PQ(m) for uncertainty measurement in the DST framework satisfies the properties of consistency with DST semantics, non-negativity, set consistency, probability consistency, additivity, and monotonicity, and does not satisfy the properties of sub-additivity, maximum entropy, and range. An overview of the properties of the existing belief entropies is listed in Table 1. Additionally, by combining the advantages of the definitions of Jiroušek-Shenoy and Pan-Deng, the new belief entropy involves more information, which better meets the requirements. The properties of maximum entropy and range, which the new belief entropy does not satisfy, need further discussion. For the maximum entropy property, we think that the uncertainty of a vacuous BPA and an equally likely Bayesian BPA should be equivalent. There is a classical example.
Assume a bet on a race among four cars, A, B, C, and D. Two experts give their opinions. Expert 1 suggests that the abilities of the four drivers and the performance of the four cars are almost the same. Expert 2 has no idea about the traits of each car and driver. The opinion of Expert 1 can be regarded as a uniform probability distribution with m(A) = m(B) = m(C) = m(D) = 1/4, while Expert 2 produces a vacuous BPA with m({A, B, C, D}) = 1. Given either of these two pieces of evidence alone, we have no information to support a certain bet. Besides, it is quite convincing that the range property is not suitable for an uncertainty measure. The range [0, log |X|] can only reflect one aspect of uncertainty and lacks consideration of the multiple uncertainties of a BPA in the DST framework. As a consequence, the properties of maximum entropy and range should be extended.
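Under the definition above, H_PQ can be sketched as follows (illustrative names; BPAs as dicts of frozensets). For a four-element FOD it assigns the same value, log 4 = 2 bits, to both the vacuous BPA of Expert 2 and the uniform Bayesian BPA of Expert 1, matching the position taken in this section:

```python
import math

def pt(m, frame):
    """Plausibility transformation: Pt(x) = Pl({x}) normalized over the frame."""
    pl = {x: sum(v for B, v in m.items() if x in B) for x in frame}
    s = sum(pl.values())
    return {x: p / s for x, p in pl.items()}

def h_pq(m, frame):
    """Proposed entropy: the discord term uses Pm(A) = sum of Pt(x) over x in A,
    and the non-specificity term is the weighted Hartley entropy."""
    p = pt(m, frame)
    discord = sum(v * math.log2(1.0 / sum(p[x] for x in A))
                  for A, v in m.items())
    nonspec = sum(v * math.log2(len(A)) for A, v in m.items())
    return discord + nonspec

frame = {'a', 'b', 'c', 'd'}
m_vac = {frozenset(frame): 1.0}                    # vacuous BPA (Expert 2)
m_uni = {frozenset({x}): 0.25 for x in frame}      # uniform Bayesian BPA (Expert 1)
```

For m_vac the discord term is 0 and the non-specificity term is log 4; for m_uni the discord term is log 4 and the non-specificity term is 0, so both BPAs receive H_PQ = 2 bits.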

Numerical Experiment
In this section, several numerical experiments are presented to demonstrate the reasonableness and effectiveness of the proposed new belief entropy.

Example 1
Let Θ = {x} be the FOD. Given a BPA with m({x}) = 1, we can obtain Pt(x) and Pm({x}) as:

Pt(x) = Pl({x}) / Pl({x}) = 1,  Pm({x}) = Pt(x) = 1.

Then, the associated Shannon entropy H_s(m) and the proposed belief entropy H_PQ(m) are calculated as follows:

H_s(m) = 1 × log (1/1) = 0,  H_PQ(m) = 1 × log (1/1) + 1 × log 1 = 0.

Obviously, this example shows that the Shannon entropy and the proposed belief entropy are equal when the FOD has only one single element, where no uncertainty exists.

Example 2
As shown above, the proposed belief entropy is the same as the Shannon entropy when the BPA is a probability distribution. Sections 5.1 and 5.2 verify that the proposed belief entropy degenerates into the Shannon entropy when the belief is assigned only to singleton elements.
Example 3
Compared with Section 5.2, the uncertainty in this example is the same as that in Section 5.2. This is reasonable. As discussed in Section 4.2, neither the uniform BPA nor the vacuous BPA on the same FOD can provide more information in favor of a determinate single element. Thus, their uncertainties should be equal.

Example 4
Two experiments from [40] are recalled in this example. Let Θ = {x_1, x_2, x_3, x_4} be the FOD. Two BPAs are given as m_1 and m_2. The detailed BPAs are: The corresponding H_PQ(m_1) and H_PQ(m_2) are calculated as follows: It can be seen from the results that the belief entropy of m_1 is larger than that of m_2. This is logical because m_1({x_1, x_2, x_3}) = 1/6 has one more single element than m_2({x_1, x_2}) = 1/6, which implies that m_1({x_1, x_2, x_3}) contains more information. Thus, m_1 should be more uncertain.

Example 5
Consider a target recognition problem from [60], where target detection results are provided by two independent sensors. Let A, B, C, and D be the potential target types. The results are represented by the BPAs shown as follows.
Though the two BPAs have the same mass values, the BPA m_1 has four potential targets, namely A, B, C, and D, while the BPA m_2 has just three potential targets, namely A, B, and C. As verified in [60], it is intuitively expected that m_1 has a larger uncertainty than m_2. According to the above calculation results, H_deng(m) indicates that the two BPAs have the same uncertainty, and H_pd(m) suggests that m_2 has a larger uncertainty. Therefore, both H_deng(m) and H_pd(m) are unable to reflect the expected difference. The proposed belief entropy can effectively quantify this divergence by considering not only the information contained in each focal element but also the mutual support degree among different focal elements. Therefore, it is safe to say that the proposed belief entropy H_PQ(m) has a capability that H_deng(m) and H_pd(m) lack.
According to the Jiroušek-Shenoy entropy H_JS(m) in Equation (23) and the proposed belief entropy H_PQ(m) in Equation (24), both entropies consist of a discord uncertainty measure and a non-specificity uncertainty measure. The H_JS(m) and H_PQ(m) values are calculated as follows.
As shown in Figure 1, the uncertainty degree measured by George and Pal's conflict measure is almost unchanged as the number of elements in proposition A increases. Similarly, Höhle's confusion measure and Yager's dissonance measure behave the same way in reflecting the variation of the uncertainty degree in this case. Thus, these three uncertainty measures cannot detect the change in proposition A. Although the uncertainty degrees obtained by Klir and Ramer's discord measure and Klir and Parviz's strife measure change with the growth of the number of elements in A, the variation trends of both methods are contrary to the expectation that the uncertainty degree increases as the number of elements in A grows. These methods only measure the discord uncertainty of the BPAs and ignore the non-specificity uncertainty. Besides, from Table 2, we can find that Yager's dissonance measure has the minimum uncertainty degree. This is because this method uses the plausibility function to measure the discord uncertainty. The plausibility function aggregates all the support degrees for the single events from other propositions, which can lead to information redundancy and an incorrect reduction of the uncertainty. To sum up, the uncertainty degrees obtained by George and Pal's method, Höhle's method, Yager's method, Klir and Ramer's method, and Klir and Parviz's method are unreasonable and counterintuitive, which means that these methods cannot measure the uncertainty in this case correctly.
From Figure 2, it can be seen that the uncertainty degrees measured by Dubois and Prade's weighted Hartley entropy, Pan and Deng's uncertainty measure, Jiroušek and Shenoy's uncertainty measure, and the proposed belief entropy increase visibly as the number of elements in A rises. These methods consider not only the discord uncertainty but also the non-specificity uncertainty. Furthermore, Pan and Deng's uncertainty measure is the largest among all the methods in Table 2. This is understandable. The non-specificity measure in Pan and Deng's method is exponential, while the others are linear. As the number of elements in A increases, the uncertainty degree of Pan and Deng's method increases faster than those of the other methods. A non-specificity measure using an exponential form may cause the possible uncertainty degree from the discord part to be significantly smaller than that from the non-specificity part. Additionally, Jiroušek and Shenoy's uncertainty measure is larger than the proposed belief entropy. Compared with Jiroušek and Shenoy's measure, which uses the probability distribution of single elements obtained by the plausibility transformation to measure the discord uncertainty, the proposed belief entropy measures it by using the information of each mass function together with the single elements contained in each focal element. The redundant information is removed, and the possible value of the discord uncertainty is decreased notably in the proposed method. More importantly, the other three uncertainty measures have shortcomings that the proposed method avoids. Dubois and Prade's weighted Hartley entropy does not consider the discord uncertainty of BPAs. Pan and Deng's uncertainty measure cannot accurately distinguish the two similar BPAs in Section 5.5. The discord component of Jiroušek and Shenoy's uncertainty measure is irrational in Section 5.6.
Thus, the proposed belief entropy is the only effective approach for uncertainty measure among these given methods in this case. Therefore, the proposed belief entropy, which considers the information contained in BPAs and single elements, is reasonable and effective for uncertainty measure in Dempster-Shafer framework.

Conclusions
How to measure the uncertainty of a BPA in the framework of DST is an open issue. The main contribution of this study is a new belief entropy proposed to quantify the uncertainty of BPAs. The proposed belief entropy comprises a discord uncertainty measurement and a non-specificity uncertainty measurement. In particular, in the discord component, the idea of the probability interval and the conversion of a BPA to a probability distribution using the plausibility transformation are combined. The new method takes advantage of the information not only of the BPAs but also of the total support degree of the single events contained in the BPAs. By addressing the appropriate information in a BPA, which means less information loss and less information redundancy, the proposed belief entropy can measure the uncertainty of a BPA efficiently. In addition, the proposed belief entropy satisfies the six desired properties of consistency with DST semantics, non-negativity, set consistency, probability consistency, additivity, and monotonicity. The results of the numerical experiments demonstrate that the proposed belief entropy is more effective and accurate than the existing uncertainty measures in the framework of DST. Future work will focus on extending the proposed method to the open-world assumption and applying it to solve problems in real applications.