Numerical Analysis of Consensus Measures within Groups

Jun-Lin Lin 1,2
1 Department of Information Management, Yuan Ze University, Taoyuan 32003, Taiwan
2 Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan 32003, Taiwan
Entropy 2018, 20(6), 408; https://doi.org/10.3390/e20060408
Submission received: 24 April 2018 / Revised: 17 May 2018 / Accepted: 22 May 2018 / Published: 25 May 2018

Abstract
Measuring the consensus for a group of ordinal-type responses is of practical importance in decision making. Many consensus measures appear in the literature, but they sometimes provide inconsistent results. Therefore, it is crucial to compare these consensus measures, and analyze their relationships. In this study, we targeted five consensus measures: $\Phi_e$ (from entropy), $\Phi_1$ (from absolute deviation), $\Phi_2$ (from variance), $\Phi_3$ (from skewness), and $\Phi_{mv}$ (from conditional probability). We generated 316,251 probability distributions, and analyzed the relationships among their consensus values. Our results showed that $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ tended to provide consistent results, and the ordering $\Phi_1 \le \Phi_e \le \Phi_2 \le \Phi_3$ held with high probability. Although $\Phi_{mv}$ had a positive correlation with $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$, it had a much lower tolerance for even a small proportion of extreme opposite opinions.

1. Introduction

A consensus measure quantifies the consensus in the ratings of a target. It carries fundamental implications for a group's decision. For example, it can reveal whether the opinions of the group's members are converging during a successive voting process [1], or whether averaging the members' ratings to the group level is appropriate [2]. Because of its practicality, the problem of measuring consensus has received much attention, both in academic and applied research [3].
Many consensus measures appear in the literature. Most of them are derived from the deviation of individual ratings from the mean [3,4], while some are based on an extension of entropy [1], or on an application of conditional probability [5]. Because consensus measures all intend to quantify consensus, one tends to assume that similar conclusions can be drawn using different consensus measures. Although this assumption usually holds, it is still possible that a set of ratings which receives the lowest consensus score under one consensus measure gets a very high consensus score under another (see Table 9). That different consensus measures might lead to different conclusions is reasonable, because they are built on different theoretical concepts. For example, let A1 and A2 denote two sets of ratings collected at times t1 and t2, with t1 < t2. One consensus measure might conclude that the consensus of A1 is smaller than that of A2 (i.e., the group members' opinions are converging), but another might yield the opposite conclusion. Therefore, it is crucial to compare these consensus measures in more detail so that one can adequately interpret the meanings of the consensus values.
The objective of this study was to analyze the relationships among different consensus measures so that one can adequately utilize these consensus measures going forward. We first reviewed five consensus measures, and their properties. Then, we took a numerical analysis approach to comparing these consensus measures. This approach proceeded by generating a large number of possible rating distributions, and calculating their consensus scores using each consensus measure. Then, these consensus scores were analyzed to reveal the relationships among these consensus measures. Finally, we discussed how to interpret these consensus scores, and how to select a suitable consensus measure.

2. Review of Consensus Measures

2.1. Basic Properties of a Consensus Measure

In this paper, we assumed that a rating was an integer in $X = \{1, 2, \ldots, n\}$. For Likert-type scale responses, $n = 5$ or $7$ is often used. Then, the ratings of all group members can be described as a probability distribution $p(x)$ over $X$. Let $p_i$ denote the probability $p(x = i)$ of getting a rating $i$. Then,

$$p_i \ge 0, \quad \text{for } i = 1 \text{ to } n, \tag{1}$$

$$\sum_{i=1}^{n} p_i = 1, \tag{2}$$

$$\text{mean: } m(p) = \sum_{i=1}^{n} i\,p_i, \tag{3}$$

$$\text{variance: } V(p) = \sum_{i=1}^{n} p_i \left(i - m(p)\right)^2. \tag{4}$$

Notably, the rating data are ordinal, and thus, calculating the mean or variance of $p(x)$ is inappropriate. Nevertheless, the mean, the variance, or a combination of both has been used intensively in the literature to design consensus measures for ordinal attributes.
Let $\Phi$ denote a consensus measure, and let $\Phi(p)$ denote the consensus score of $p(x)$ based on $\Phi$. It is common to restrict the range of $\Phi(p)$ to between zero and one; this restriction also facilitates comparing different consensus measures. Thus, $0 \le \Phi(p) \le 1$, where $\Phi(p) = 1$ and $\Phi(p) = 0$ indicate the maximum and minimum consensus scores, respectively [5]. In this paper, we divided the consensus measures into three categories, as described in the three subsections below.

2.2. Deviation-Based Consensus Measures

Deviation-based consensus measures use the absolute deviation of individual ratings from their mean to measure the consensus. They mainly differ in the power of the absolute deviation. In the literature, power = 1 or 2 was used to measure consensus. In this study, we extended the power to 3.
The average deviation ($AD$) [6] is the average absolute difference between each rating and the mean, as shown in Equation (5). It is a measure of variability, and its range is between 0 and $\frac{n-1}{2}$, as proven in Corollary 1. Based on $AD$, we can design a consensus measure $\Phi_1(p)$ such that $0 \le \Phi_1(p) \le 1$ (see Definition 1).

$$AD(p) = \sum_{i=1}^{n} p_i\,\lvert i - m(p) \rvert. \tag{5}$$

Corollary 1.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le AD(p) \le \frac{n-1}{2}$ holds.
Proof.
See Appendix A. ☐
Definition 1.
Consensus measure $\Phi_1(p) = 1 - \dfrac{AD(p)}{(n-1)/2}$.
Similar to $AD$, the variance ($V$) is also a measure of variability, defined as the average squared difference between each rating and the mean, as shown in Equation (4). Its range is between 0 and $\left(\frac{n-1}{2}\right)^2$, as proven in Corollary 2. Elzinga et al. [4] designed a consensus measure $\Phi_2(p)$ based on $V$ (see Definition 2).
Corollary 2.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le V(p) \le \left(\frac{n-1}{2}\right)^2$ holds.
Proof.
See Appendix B. ☐
Definition 2.
Consensus measure $\Phi_2(p) = 1 - \dfrac{V(p)}{((n-1)/2)^2}$ [4].
Notably, $AD$ uses the absolute difference between each rating and the mean, while the variance uses the squared difference. We can raise the power of the absolute difference to three, and design a new consensus measure $\Phi_3(p)$ as follows: let $S$ denote the average cubed absolute difference between each rating and the mean, as shown in Equation (6). The range of $S$ is between 0 and $\left(\frac{n-1}{2}\right)^3$, as proven in Corollary 3. A consensus measure $\Phi_3(p)$ based on $S$ is shown in Definition 3.

$$S(p) = \sum_{i=1}^{n} p_i\,\lvert i - m(p) \rvert^3. \tag{6}$$

Corollary 3.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le S(p) \le \left(\frac{n-1}{2}\right)^3$ holds.
Proof.
See Appendix C. ☐
Definition 3.
Consensus measure $\Phi_3(p) = 1 - \dfrac{S(p)}{((n-1)/2)^3}$.
The maximum values of $\Phi_1(p)$, $\Phi_2(p)$, and $\Phi_3(p)$ all occur when $p_k = 1$ for some $k \in X$ and $p_i = 0$ for all $i \in X \setminus \{k\}$. The minimum values of $\Phi_1(p)$, $\Phi_2(p)$, and $\Phi_3(p)$ all occur when $p_1 = p_n = 0.5$ and $p_i = 0$ for all $i \in X \setminus \{1, n\}$. Please see the proofs of Corollaries 1, 2, and 3 in Appendix A, Appendix B, and Appendix C, respectively, for details.
Essentially, in $\Phi_1(p)$, $\Phi_2(p)$, and $\Phi_3(p)$, raising the power of the absolute deviation increases the impact of those ratings further from the mean. An example is given below.
Example 1.
Given a probability distribution $p(x)$ over $X = \{1, 2, 3, 4, 5\}$ where $p_i = 0.2$ for all $i \in X$, a (lower-consensus) probability distribution $q(x)$ with more probability further from the mean is generated from $p(x)$ by shifting 0.05 of probability from $x = 4$ to $x = 5$, i.e., $q_1 = q_2 = q_3 = 0.2$, $q_4 = 0.15$, and $q_5 = 0.25$. Table 1 shows $AD$, $V$, $S$, $\Phi_1$, $\Phi_2$, and $\Phi_3$ of $p(x)$ and $q(x)$. The last row of Table 1 indicates that, from $p$ to $q$, the consensus is reduced by 0.03 with $\Phi_1$, by 0.03688 with $\Phi_2$, and by 0.04211 with $\Phi_3$. That is, the impact of increasing the probability further from the mean is greatest in $\Phi_3$, less in $\Phi_2$, and least in $\Phi_1$.
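To make these definitions concrete, here is a minimal Python sketch (the paper does not specify an implementation language, so Python is our assumption) of $AD$, $V$, $S$, and the three deviation-based consensus measures; it reproduces the values in Table 1:

```python
# Deviation-based consensus measures (Definitions 1-3) for a distribution over
# X = {1, ..., n}; p is a list with p[i-1] = Pr(x = i).

def mean(p):
    return sum(i * pi for i, pi in enumerate(p, start=1))

def abs_moment(p, r):
    """Average r-th power of |i - m(p)|: r = 1 gives AD, r = 2 gives V, r = 3 gives S."""
    m = mean(p)
    return sum(pi * abs(i - m) ** r for i, pi in enumerate(p, start=1))

def phi(p, r):
    """Phi_1, Phi_2, Phi_3 for r = 1, 2, 3: one minus the moment over ((n-1)/2)^r."""
    n = len(p)
    return 1 - abs_moment(p, r) / ((n - 1) / 2) ** r

# Example 1: shifting 0.05 of probability from x = 4 to x = 5 lowers every
# score, and the drop grows with the power r (cf. Table 1).
p = [0.2, 0.2, 0.2, 0.2, 0.2]
q = [0.2, 0.2, 0.2, 0.15, 0.25]
for r in (1, 2, 3):
    print(r, round(phi(p, r), 6), round(phi(q, r), 6))
# r=1: 0.4 vs 0.37; r=2: 0.5 vs 0.463125; r=3: 0.55 vs ~0.507888
```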

2.3. Conditional-Probability-Based Consensus Measure

Corollary 2 shows that the range of the variance $V$ is between 0 and $\left(\frac{n-1}{2}\right)^2$, and the consensus measure $\Phi_2$ is constructed based on this range. However, the range of $V$ is a function of the mean $m$. Specifically, for a given value of $m$, the range of $V$ is between $(m - \lfloor m \rfloor)(\lfloor m \rfloor + 1 - m)$ and $(m - 1)(n - m)$, where $\lfloor m \rfloor$ is the greatest integer $\le m$. The size of this range is small as the value of $m$ approaches either end of the interval $[1, n]$, and is large as the value of $m$ approaches the center of the interval $[1, n]$. Thus, Akiyama et al. [5] proposed a new consensus measure via the conditional probability $p(V \mid m)$. Because this consensus measure is calculated using both $m$ and $V$, we denoted it as $\Phi_{mv}(p)$ in this paper. Figure 1 shows the steps to calculate $\Phi_{mv}(p)$ for a probability distribution $p(x)$ over $X = \{1, 2, 3, 4, 5\}$.
Table 2 shows some examples of the probability distribution $p(x)$ with $\Phi_{mv}(p) = 1$ or $0$. Unlike $\Phi_1$, $\Phi_2$, and $\Phi_3$, $\Phi_{mv}(p) = 1$ not only occurs when $p_k = 1$ for some $k \in X$ and $p_i = 0$ for all $i \in X \setminus \{k\}$, but also in many other cases. The first four examples in Table 2 show that the maximum value of $\Phi_{mv}(p)$ occurs when all of the probability lies on one side of the scale, and none on the other. Similarly, $\Phi_{mv}(p) = 0$ not only happens when $p_1 = p_n = 0.5$ and $p_i = 0$ for all $i \in X \setminus \{1, n\}$, but also in many other cases. The last three examples in Table 2 show that a small proportion of extreme opposite opinions can drag $\Phi_{mv}(p)$ to zero.
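The mean-dependent range of $V$ quoted above can be checked numerically. The following sketch is our own check rather than part of Reference [5]; it enumerates distributions on a coarser 0.1 grid and asserts that $V(p)$ always lies between $(m - \lfloor m \rfloor)(\lfloor m \rfloor + 1 - m)$ and $(m - 1)(n - m)$:

```python
import itertools, math

n, steps = 5, 10  # ratings 1..5; probabilities on a 0.1 grid
for c in itertools.product(range(steps + 1), repeat=n - 1):
    if sum(c) > steps:
        continue  # skip cases whose probabilities would exceed one
    p = [x / steps for x in c] + [(steps - sum(c)) / steps]
    m = sum(i * pi for i, pi in enumerate(p, start=1))
    v = sum(pi * (i - m) ** 2 for i, pi in enumerate(p, start=1))
    lo = (m - math.floor(m)) * (math.floor(m) + 1 - m)
    hi = (m - 1) * (n - m)
    assert lo - 1e-9 <= v <= hi + 1e-9  # mean-conditional range of V holds
print("bounds verified")
```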

2.4. Entropy-Based Consensus Measure

In the literature, the Shannon entropy and its extensions have been used to quantify the diversity of a probability distribution [7]. Given a probability distribution $p(x)$, the Shannon entropy of $p(x)$ is $-\sum_{i=1}^{n} p_i \ln(p_i)$, where $n$ is the number of possible values of $x$, and $p_i$ denotes the probability of $x = i$. Because diversity appears to be the opposite concept of consensus, and the range of the Shannon entropy is between 0 and $\ln(n)$, a consensus measure between 0 and 1 based on the Shannon entropy can be defined as follows [1,8]:

$$\Phi = 1 + \frac{\sum_{i=1}^{n} p_i \ln(p_i)}{\ln(n)}. \tag{7}$$

Notably, the Shannon entropy treats the variable $x$ as a nominal variable, not as an ordinal variable; thus, the Shannon entropy and Equation (7) are inappropriate for quantifying the consensus of ordinal data, such as Likert-type scale responses. To resolve this problem, Tastle and Wierman [1,8] extended the Shannon entropy equation to define a new consensus measure, denoted as $\Phi_e$ in this paper, as follows:

$$\Phi_e = 1 + \sum_{i=1}^{n} p_i \log_2\left(1 - \frac{|i - m|}{n - 1}\right), \tag{8}$$

where $m$ is the mean of $p(x)$, as defined in Equation (3). Similar to $\Phi_1(p)$, $\Phi_2(p)$, and $\Phi_3(p)$, the maximum value of $\Phi_e(p)$ only occurs when $p_k = 1$ for some $k \in X$ and $p_i = 0$ for all $i \in X \setminus \{k\}$; the minimum value of $\Phi_e(p)$ only occurs when $p_1 = p_n = 0.5$ and $p_i = 0$ for all $i \in X \setminus \{1, n\}$.
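Equation (8) translates directly into code. In the sketch below, terms with $p_i = 0$ are skipped, following the usual $0 \cdot \log 0 = 0$ convention; the third call checks Example 1 of Table 9:

```python
import math

def phi_e(p):
    """Tastle-Wierman consensus, Equation (8)."""
    n = len(p)
    m = sum(i * pi for i, pi in enumerate(p, start=1))
    return 1 + sum(pi * math.log2(1 - abs(i - m) / (n - 1))
                   for i, pi in enumerate(p, start=1) if pi > 0)

print(phi_e([0, 0, 1, 0, 0]))        # 1.0: all probability on one rating
print(phi_e([0.5, 0, 0, 0, 0.5]))    # 0.0: evenly split between the extremes
print(phi_e([0.98, 0, 0, 0, 0.02]))  # ~0.858559: Example 1 in Table 9
```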

3. Experimental Study

3.1. Experiment Setup

Given a probability distribution, the five consensus measures reviewed in Section 2 often yielded different consensus scores, and sometimes the differences among these scores were substantial, and led to opposite conclusions. This phenomenon makes it difficult to interpret the meaning of these scores. In this study, we performed a numerical experiment to analyze the relationships among these five consensus measures.
This experiment used probability distributions $p(x)$ over $X = \{1, 2, 3, 4, 5\}$, which is common for Likert-type scale data. Specifically, we wrote a small computer program containing a five-level for loop to generate 316,251 probability distributions, where the $i$-th level of the for loop changed the value of $p_i$ from 0 to 1 with a step size of 0.02, and cases not satisfying $\sum_{i=1}^{5} p_i = 1$ were skipped. Thus, these 316,251 probability distributions covered all of the possible probability distributions of $p(x)$ satisfying $p_i \in \{0, 0.02, 0.04, \ldots, 0.98, 1\}$ for $i = 1$ to 5, and $\sum_{i=1}^{5} p_i = 1$. Then, the consensus scores of each generated probability distribution were calculated and compared to study the relationships among the five consensus measures. Table 3 shows the distribution of the mean values of the 316,251 probability distributions. Most of the generated probability distributions had mean values between 2 and 4.
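As a sketch of the generation step (in Python, our assumed language; the paper only mentions "a small computer program"), the loop can work in integer fiftieths so that the sum-to-one constraint is exact, with the innermost level forced by the constraint rather than skipped; the name `dists` is ours and is reused by the later sketches:

```python
# Enumerate all p with p_i in {0, 0.02, ..., 1} and p_1 + ... + p_5 = 1.
dists = []
for a in range(51):
    for b in range(51 - a):
        for c in range(51 - a - b):
            for d in range(51 - a - b - c):
                e = 50 - a - b - c - d  # the fifth value is forced by the constraint
                dists.append((a / 50, b / 50, c / 50, d / 50, e / 50))
print(len(dists))  # 316251
```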

3.2. Correlation

Table 4 shows the Kendall rank correlation coefficients between any two consensus measures. As expected, the results reflected a correlation higher than 0.887 between any two consensus measures. That is, if a probability distribution A is ranked higher than another probability distribution B based on one consensus measure, it is very likely that A is also ranked higher than B based on another consensus measure. Let $\tau(\Phi_i, \Phi_j)$ denote the Kendall rank correlation coefficient between $\Phi_i$ and $\Phi_j$. According to Table 4, the lowest correlation occurred at $\tau(\Phi_1, \Phi_3)$, and the highest at $\tau(\Phi_1, \Phi_e)$. Specifically, $\tau(\Phi_1, \Phi_e) > \tau(\Phi_2, \Phi_e) > \tau(\Phi_3, \Phi_{mv}) > \tau(\Phi_2, \Phi_{mv}) > \tau(\Phi_2, \Phi_3) > \tau(\Phi_1, \Phi_2) > \tau(\Phi_e, \Phi_{mv}) > \tau(\Phi_e, \Phi_3) > \tau(\Phi_1, \Phi_{mv}) > \tau(\Phi_1, \Phi_3)$.
According to Table 3, only 5.18% and 4.82% of the 316,251 generated probability distributions had their mean values in the intervals $[1, 2]$ and $(4, 5]$, respectively. To check whether high correlation still existed for probability distributions with small or large mean values, we calculated the Kendall rank correlation coefficients using both subsets of probability distributions, and the results are shown in Table 5 and Table 6. Every value in Table 5 and Table 6 was smaller than its corresponding value in Table 4. In particular, $\tau(\Phi_1, \Phi_3)$ dropped from 0.887252 in Table 4 to 0.774093 in Table 5, and 0.772132 in Table 6; $\tau(\Phi_1, \Phi_{mv})$ dropped from 0.925708 in Table 4 to 0.785614 in Table 5, and 0.776873 in Table 6.
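The coefficients in Tables 4–6 can be recomputed with scipy.stats.kendalltau, reusing `dists`, `mean`, `phi`, and `phi_e` from the sketches above; matching the printed digits exactly assumes our setup mirrors the paper's:

```python
from scipy.stats import kendalltau

# Full set of 316,251 distributions (Table 4).
tau, _ = kendalltau([phi(p, 1) for p in dists], [phi_e(p) for p in dists])
print(round(tau, 6))  # cf. 0.990202 for tau(Phi_1, Phi_e) in Table 4

# Subset with small means, as in Table 5.
small = [p for p in dists if mean(p) <= 2]
tau_small, _ = kendalltau([phi(p, 1) for p in small], [phi_e(p) for p in small])
print(round(tau_small, 6))  # cf. 0.967110 for tau(Phi_1, Phi_e) in Table 5
```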

3.3. Range of Difference

Although Table 4 shows that a positive correlation existed between any two consensus measures over the 316,251 generated probability distributions, some of the generated probability distributions did not follow this general trend. In this section, we calculated the range of differences between two consensus measures to show that this difference was usually small, but sometimes very large.
Table 7 shows the mean differences between any two consensus measures over the 316,251 generated probability distributions. All of the mean differences were small (<0.167); the largest mean difference occurred between $\Phi_1$ and $\Phi_3$, and the smallest between $\Phi_1$ and $\Phi_e$. The results were consistent with Table 4, where the smallest and the largest correlation coefficients were $\tau(\Phi_1, \Phi_3)$ and $\tau(\Phi_1, \Phi_e)$, respectively.
Table 8 shows the maximum difference between any two consensus measures over the 316,251 generated probability distributions. Some of the maximum differences were very large. For example, the maximum difference between $\Phi_{mv}$ and each of the other consensus measures was larger than 0.84. Notably, all of the correlation coefficients between $\Phi_{mv}$ and the other consensus measures were greater than 0.92 (see Table 4), and the mean difference between $\Phi_{mv}$ and each of the other consensus measures was less than 0.16 (see Table 7). Thus, it is reasonable to infer that, although the difference between $\Phi_{mv}$ and the other consensus measures was not large for most probability distributions, it could be huge for some. Therefore, it is important to understand for which kinds of probability distributions such a big difference between consensus measures occurs.
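The entries of Tables 7 and 8 reduce to the mean and the maximum of pairwise absolute differences. A short numpy sketch, again reusing the helpers defined above:

```python
import numpy as np

s1 = np.array([phi(p, 1) for p in dists])
se = np.array([phi_e(p) for p in dists])
diff = np.abs(s1 - se)
print(diff.mean())                # cf. 0.0381011 for Phi_1 vs Phi_e in Table 7
print(diff.max())                 # cf. 0.108996 for Phi_1 vs Phi_e in Table 8
print(dists[int(diff.argmax())])  # one distribution attaining the maximum
```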
The first four examples in Table 9 show some of the generated probability distributions where the maximum differences between two consensus measures occurred. Example 1 had a large proportion (98%) of probability at $x = 1$, thus rendering high consensus scores using $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$. However, this large proportion of probability at $x = 1$ also made the value of $m$ close to 1, where $m$ was the mean of the probability distribution. As discussed in Section 2.3, the range of the variance is small when $m$ approaches either end of the interval $[1, n]$. Thus, for values of $m$ close to 1, the range of the variance was small, making $\Phi_{mv}$ very sensitive to even a small proportion of probability at the opposite end of $x$ (2% at $x = 5$ in this example). As a result, Example 1 yielded $\Phi_{mv} = 0$. This example was also one of the probability distributions among the 316,251 generated that had the maximum difference (in Table 8) between $\Phi_{mv}$ and the other consensus measures.
Examples 2 and 3 in Table 9 were similar to Example 1, with a large proportion of probability at $x = 1$, and a small proportion at $x = 5$. The values of $\Phi_{mv}$ remained 0 for Examples 2 and 3. However, the difference between $p_1$ and $p_5$ decreased from Example 1 through to Example 3, making $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ smaller for Examples 2 and 3 than for Example 1. Notably, Example 2 was one of the probability distributions that had the maximum difference (in Table 8) between $\Phi_1$ and $\Phi_e$; Example 3 was one of the probability distributions that had the maximum difference between $\Phi_2$ and $\Phi_3$.
Example 4 had $p_3 = p_5 = 0.5$, and yielded the maximum difference (in Table 8) between $\Phi_1$ and $\Phi_2$, between $\Phi_1$ and $\Phi_3$, between $\Phi_e$ and $\Phi_2$, and between $\Phi_e$ and $\Phi_3$. Suppose that the first four examples in Table 9 describe the voting results at four different stages of a successive voting process. From Example 1 through to Example 4, the value of $\Phi_1$ decreased, indicating that the group's opinions were diverging. However, using $\Phi_{mv}$ led to the opposite conclusion. For $\Phi_e$, $\Phi_2$, and $\Phi_3$, the consensus first decreased (from Example 1 through to Example 3), and then increased (from Example 3 to Example 4). However, the differences between the consensus values of Examples 1 and 4 were 0.273596 with $\Phi_e$, 0.1716 with $\Phi_2$, and −0.02565 with $\Phi_3$. Thus, using different consensus measures could lead to different conclusions.
A small change in the probability distribution could have a different impact on different consensus measures. Consider Examples 1, 7, and 6, which differ by moving a small proportion (2%) of probability from $x = 5$ to $x = 4$, and then to $x = 3$. Although these are similar probability distributions, the value of $\Phi_{mv}$ was 0 in Example 1, increased only to 0.166667 in Example 7, but jumped to 0.833333 in Example 6. In contrast, the values of $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ did not change much among these three examples. Notably, the proportion of probability further from the mean had a greater negative impact on $\Phi_3$ than on $\Phi_2$ and $\Phi_1$. Thus, by moving 2% of probability from $x = 5$ to $x = 4$ (i.e., closer to the mean), the ordering of $\Phi_1$, $\Phi_2$, and $\Phi_3$ changed from $\Phi_3 < \Phi_2 = \Phi_1$ in Example 1 to $\Phi_3 < \Phi_1 < \Phi_2$ in Example 7. Then, by moving 2% of probability from $x = 4$ to $x = 3$, the ordering changed to $\Phi_1 < \Phi_2 < \Phi_3$ in Example 6.
The ordering of the values of these consensus measures depended on the probability distribution. For Examples 4, 5, and 6, the value of $\Phi_{mv}$ was the same, but $\Phi_1 < \Phi_e < \Phi_2 < \Phi_{mv} < \Phi_3$ held in Example 4, $\Phi_e < \Phi_1 < \Phi_{mv} < \Phi_3 < \Phi_2$ held in Example 5, and $\Phi_{mv} < \Phi_1 < \Phi_e < \Phi_2 < \Phi_3$ held in Example 6. In Example 7, $\Phi_{mv}$ was the smallest among all of the consensus measures; however, in Example 8, it was the greatest.

3.4. Ordering

From the examples in Table 9, it appears that no fixed ordering existed among the consensus scores calculated using different consensus measures. Figure 2 shows the distributions of the consensus scores of the 316,251 probability distributions generated in this experiment. The distributions of the scores based on $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ were similar to one another, but very different from the distribution of the scores based on $\Phi_{mv}$. For consensus values close to 1, the ordering of the probabilities among $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ was $\Phi_1 < \Phi_e < \Phi_2 < \Phi_3$, but for consensus values close to 0, the ordering became $\Phi_1 > \Phi_e > \Phi_2 > \Phi_3$.
In Table 10, we compared the consensus scores of the 316,251 generated probability distributions, and calculated the probability of the scores based on one consensus measure being less than or equal to the scores based on another consensus measure. According to Table 10, $\Phi_1 \le \Phi_2$ and $\Phi_e \le \Phi_2$ always held, while $\Phi_2 \le \Phi_3$, $\Phi_e \le \Phi_3$, $\Phi_1 \le \Phi_3$, and $\Phi_1 \le \Phi_e$ also held with very high probability. Thus, $\Phi_1 \le \Phi_e \le \Phi_2 \le \Phi_3$ was the most probable ordering among the scores based on these four consensus measures. The orderings between $\Phi_{mv}$ and $\Phi_1$ or $\Phi_e$ were not apparent: $\Phi_1 \le \Phi_{mv}$ and $\Phi_e \le \Phi_{mv}$ only held at 58.12% and 52.04% probability, respectively. Finally, $\Phi_2 > \Phi_{mv}$ and $\Phi_3 > \Phi_{mv}$ were likely to occur because $\Phi_2 \le \Phi_{mv}$ and $\Phi_3 \le \Phi_{mv}$ held at only 36.84% and 28.01% probability, respectively.
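Each entry of Table 10 is the empirical frequency of one score being less than or equal to another. A sketch follows (reusing the arrays from the previous sketch); a tiny tolerance guards the equality cases, e.g., $\Phi_1 = \Phi_2$ at the endpoints, against floating-point rounding:

```python
import numpy as np

s1 = np.array([phi(p, 1) for p in dists])
se = np.array([phi_e(p) for p in dists])
print(np.mean(s1 <= se + 1e-12))  # cf. 94.66% for Phi_1 <= Phi_e in Table 10
```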

3.5. Relationships

To visually inspect the relationships among different consensus measures, we plotted the consensus values of the 316,251 generated probability distributions in two-dimensional (2D) scatter charts.
Figure 3 shows the scatter charts of the $\Phi_1$ scores versus the scores based on the other consensus measures, where the red dashed lines represent equality between two consensus scores. As expected, a positively correlated trend existed. No fixed ordering existed between $\Phi_1$ and the other consensus measures, except that $\Phi_1 \le \Phi_2$ always held, as shown in Figure 3b. According to Figure 3a–c, as the value of $\Phi_1$ approached 0 or 1, the ranges of $\Phi_e$, $\Phi_2$, and $\Phi_3$ narrowed, indicating that the maximum differences between $\Phi_1$ and each of $\Phi_e$, $\Phi_2$, and $\Phi_3$ decreased. However, when the value of $\Phi_1$ approached 0.5, the ranges of $\Phi_e$, $\Phi_2$, and $\Phi_3$ increased, indicating that these maximum differences also increased. Furthermore, the maximum difference between $\Phi_1$ and $\Phi_e$ was smaller than both the maximum difference between $\Phi_1$ and $\Phi_2$, and that between $\Phi_1$ and $\Phi_3$.
Figure 3d shows that, for $\Phi_1 < 1$, as the value of $\Phi_1$ increased, the range of $\Phi_{mv}$ increased, and the maximum difference between $\Phi_1$ and $\Phi_{mv}$ became huge. For any probability distribution satisfying $\Phi_1 = 1$, its $\Phi_{mv}$ was also 1. However, for any probability distribution satisfying $\Phi_{mv} = 1$, its value of $\Phi_1$ was not necessarily 1. In fact, there were only $n$ probability distributions satisfying $\Phi_1 = 1$, namely those with $p_k = 1$ for some $k \in X$ and $p_i = 0$ for all $i \in X \setminus \{k\}$ (this statement also applies to $\Phi_e$, $\Phi_2$, and $\Phi_3$). However, there were many probability distributions satisfying $\Phi_{mv} = 1$ (see Table 2 for examples).
Figure 4 shows the scatter charts of the consensus scores based on $\Phi_e$, $\Phi_2$, $\Phi_3$, and $\Phi_{mv}$. No fixed ordering existed among these consensus measures, except that $\Phi_e \le \Phi_2$ always held, as shown in Figure 4a. According to Figure 4a,b,d, for $\Phi_e$, $\Phi_2$, and $\Phi_3$, as the value of one consensus measure approached either end of the interval $[0, 1]$, the range of another consensus measure decreased. According to Figure 4a,b, the maximum difference between $\Phi_e$ and $\Phi_2$ was smaller than that between $\Phi_e$ and $\Phi_3$. According to Figure 3b and Figure 4a,d, the maximum difference between $\Phi_2$ and $\Phi_e$ was smaller than those between $\Phi_2$ and $\Phi_1$, and between $\Phi_2$ and $\Phi_3$. Figure 4c,e,f show a pattern similar to Figure 3d: as the value of $\Phi_e$ (or $\Phi_2$, or $\Phi_3$) increased (before reaching 1), the range of $\Phi_{mv}$ increased, and the maximum difference between $\Phi_e$ (or $\Phi_2$, or $\Phi_3$) and $\Phi_{mv}$ became huge.

4. Discussion

Given a probability distribution, using different consensus measures often yields different consensus scores. If a fixed ordering existed among these scores, then consistent results could be drawn using different consensus measures. Unfortunately, such an ordering depends on the given probability distribution. However, according to Table 10, the following orderings among the consensus scores held with high probability: $\Phi_1 \le \Phi_e \le \Phi_2 \le \Phi_3$, $\Phi_2 > \Phi_{mv}$, and $\Phi_3 > \Phi_{mv}$.
Because there exists no fixed ordering among the consensus scores based on different consensus measures, it is crucial to know the relationships among the consensus measures. Figure 3 and Figure 4 revealed that, for $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$, as the value of one consensus measure approached either end of the interval $[0, 1]$, the ranges of the other consensus measures decreased. Thus, one can expect smaller differences among $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ for consensus scores close to 0 or 1 than for consensus scores close to 0.5.
According to Figure 3d and Figure 4c,e,f, the range of $\Phi_{mv}$ increased rapidly as the value of $\Phi_1$, $\Phi_e$, $\Phi_2$, or $\Phi_3$ increased. Thus, $\Phi_{mv}$ often gave results inconsistent with those from $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$, especially when the value of $\Phi_1$, $\Phi_e$, $\Phi_2$, or $\Phi_3$ was large. Looking at these figures from another perspective, the ranges of $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ decreased as the value of $\Phi_{mv}$ increased. Notably, $\Phi_{mv}$ tended to give low scores to probability distributions where some probability was located at the opposite end from the mean. Thus, for values of $\Phi_{mv}$ close to zero, one should also check the values of $\Phi_1$, $\Phi_e$, $\Phi_2$, and $\Phi_3$ for possibly inconsistent results.
Choosing a consensus measure remains a task for the user. If one has a low tolerance for even a small proportion of extreme opposite opinions, then $\Phi_{mv}$ is a good choice. Otherwise, the other consensus measures tend to provide consistent results. If one prefers to emphasize the opinions further from the mean, then $\Phi_3$ is a good choice. Otherwise, either $\Phi_1$ or $\Phi_e$ can be used, as both yield similar results. Finally, $\Phi_2$ provides a middle ground between $\Phi_3$ and $\Phi_1$.

5. Conclusions

An understanding of the characteristics of consensus measures helps users interpret the results. For example, according to Figure 3b, $\Phi_1$ tended to yield a smaller consensus score than $\Phi_2$ for the same probability distribution; thus, a probability distribution A with $\Phi_1(A) = 0.6$ might have more consensus than another probability distribution B with $\Phi_2(B) = 0.7$, even though $\Phi_1(A) < \Phi_2(B)$.
In essence, two opposite forces shape the design of a consensus measure: the force of obeying the majority, and the force of respecting the minority. The consensus measure $\Phi_e$ stresses the former, and the opinion of the minority has a weak impact on its consensus scores. In contrast, $\Phi_{mv}$ emphasizes the latter, and the opinion of the minority substantially influences its consensus scores, as shown in the first four examples in Table 9.
The deviation-based consensus measures (i.e., $\Phi_1$, $\Phi_2$, and $\Phi_3$) allow users to adjust the strengths of these two forces. As described in Section 2.2, raising the power of the absolute deviation in the deviation-based consensus measures increases the impact of ratings further from the mean. Intuitively, unless the probabilities of all opinions are distributed evenly on opposite sides of the mean (e.g., $p_1 = p_n = 0.5$), ratings further from the mean represent the opinions of the minority. Thus, going from $\Phi_1$ through to $\Phi_3$, the impact of the minority increases. Overall, fine-tuning the balance between the force of obeying the majority and the force of respecting the minority provides a consensus measure with more flexibility for various situations, and is a direction of research worth exploring.

Funding

This research was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 106-2221-E-155-038.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

In this section, we derived the range of $AD(p)$, where $p$ is a probability distribution over $X = \{1, 2, \ldots, n\}$ with mean $m$. Lemma 1 shows that, by moving each $p_i$ with $i \le m$ gradually toward $p_1$, the $AD$ of the resulting distribution keeps increasing. Similarly, Lemma 2 shows that, by moving each $p_i$ with $i > m$ gradually toward $p_n$, the $AD$ of the resulting distribution also keeps increasing.
Lemma 1.
Let $p(x)$ and $q(x)$ be two probability distributions over $X = \{1, 2, \ldots, n\}$, let $p_i$ and $q_i$ denote $p(x = i)$ and $q(x = i)$, respectively, and let $p_i < 1$ and $q_i < 1$ for each $i \in X$. Let $m = \sum_{i=1}^{n} i\,p_i$, and let $k$ denote the greatest integer satisfying $1 < k \le m$ and $p_k > 0$. If $q_{k-1} = p_{k-1} + p_k$, $q_k = 0$, and $q_i = p_i$ for each $i \in X \setminus \{k-1, k\}$, then $AD(q) > AD(p)$.
Proof.
By Equation (3), the mean of $q(x)$ is
$$m' = \left(\sum_{i=1}^{k-2} i\,p_i\right) + (k-1)(p_{k-1} + p_k) + \left(\sum_{i=k+1}^{n} i\,p_i\right) = \left(\sum_{i=1}^{n} i\,p_i\right) - p_k = m - p_k.$$
Let $j$ denote the smallest integer such that $m < j$ and $p_j > 0$. Then, $p_i = 0$ for $k+1 \le i \le j-1$, and $q_i = 0$ for $k \le i \le j-1$. Thus,
$$\sum_{i=k+1}^{j-1} |i - m'|\,p_i = \sum_{i=k}^{j-1} |i - m'|\,q_i = 0.$$
Also, $0 < p_k < 1$, $k \le m$, and $m < j$ yield $k - 1 \le m - 1 < m' < m < j$. Hence,
$$\begin{aligned}
AD(q) &= \left(\sum_{i=1}^{k-2} (m' - i)\,p_i\right) + \left(m' - (k-1)\right)(p_{k-1} + p_k) + \left(\sum_{i=k}^{j-1} |i - m'|\,q_i\right) + \left(\sum_{i=j}^{n} (i - m')\,p_i\right) \\
&= \left(\sum_{i=1}^{k-2} (m - i)\,p_i\right) - p_k \sum_{i=1}^{k-2} p_i + \left(m - (k-1)\right)p_{k-1} + (m - k)\,p_k + p_k - p_k(p_{k-1} + p_k) \\
&\qquad + \left(\sum_{i=k+1}^{j-1} |i - m|\,p_i\right) + \left(\sum_{i=j}^{n} (i - m)\,p_i\right) + p_k \sum_{i=j}^{n} p_i \\
&= AD(p) + p_k \left(1 - \sum_{i=1}^{k} p_i + \sum_{i=j}^{n} p_i\right) = AD(p) + 2 p_k \sum_{i=j}^{n} p_i > AD(p). \;☐
\end{aligned}$$
Lemma 2.
Let $p(x)$ and $q(x)$ be two probability distributions over $X = \{1, 2, \ldots, n\}$, let $p_i$ and $q_i$ denote $p(x = i)$ and $q(x = i)$, respectively, and let $p_i < 1$ and $q_i < 1$ for each $i \in X$. Let $m = \sum_{i=1}^{n} i\,p_i$, and let $j$ denote the smallest integer satisfying $m < j < n$ and $p_j > 0$. If $q_j = 0$, $q_{j+1} = p_j + p_{j+1}$, and $q_i = p_i$ for each $i \in X \setminus \{j, j+1\}$, then $AD(q) > AD(p)$.
Proof.
By Equation (3), the mean of $q(x)$ is
$$m' = \left(\sum_{i=1}^{j-1} i\,p_i\right) + (j+1)(p_j + p_{j+1}) + \left(\sum_{i=j+2}^{n} i\,p_i\right) = \left(\sum_{i=1}^{n} i\,p_i\right) + p_j = m + p_j.$$
Let $k$ denote the greatest integer such that $1 < k \le m$ and $p_k > 0$. Then, $p_i = 0$ for $k+1 \le i \le j-1$, and $q_i = 0$ for $k+1 \le i \le j$. Thus,
$$\sum_{i=k+1}^{j-1} |i - m'|\,p_i = \sum_{i=k+1}^{j} |i - m'|\,q_i = 0.$$
Also, $0 < p_j < 1$, $k \le m$, and $m < j$ yield $k \le m < m' < m + 1 < j + 1$. Hence,
$$\begin{aligned}
AD(q) &= \left(\sum_{i=1}^{k} (m' - i)\,p_i\right) + \left(\sum_{i=k+1}^{j} |i - m'|\,q_i\right) + \left((j+1) - m'\right)(p_j + p_{j+1}) + \left(\sum_{i=j+2}^{n} (i - m')\,p_i\right) \\
&= \left(\sum_{i=1}^{k} (m - i)\,p_i\right) + p_j \sum_{i=1}^{k} p_i + \left(\sum_{i=k+1}^{j-1} |i - m|\,p_i\right) + (j - m)\,p_j + p_j + \left((j+1) - m\right)p_{j+1} \\
&\qquad - p_j(p_j + p_{j+1}) + \left(\sum_{i=j+2}^{n} (i - m)\,p_i\right) - p_j \sum_{i=j+2}^{n} p_i \\
&= AD(p) + p_j \left(\sum_{i=1}^{k} p_i + 1 - \sum_{i=j}^{n} p_i\right) = AD(p) + 2 p_j \sum_{i=1}^{k} p_i > AD(p). \;☐
\end{aligned}$$
Lemmas 3 and 4 were used to derive the upper bound of A D in Corollary 1.
Lemma 3.
Given a distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, there exists a distribution $q(x)$ with $q_1 + q_n = 1$ and $q_i = 0$ for each $i \in X \setminus \{1, n\}$, satisfying $AD(q) \ge AD(p)$.
Proof.
First, consider the trivial case of $p_i = 1$ for some $i \in X$. Let $q_1 = 1$; then $AD(q) = AD(p)$ holds, obviously. Next, consider the case of $p_i < 1$ for each $i \in X$.
Let $m = \sum_{i=1}^{n} i\,p_i$ denote the mean of $p(x)$, let $k$ denote the greatest integer satisfying $1 < k \le m$ and $p_k > 0$, and let $j$ denote the smallest integer satisfying $m < j < n$ and $p_j > 0$. We can generate a new distribution $q(x)$ by repeatedly applying Lemma 1 to move each $p_i$ with $i \le k$ gradually toward $p_1$, and by repeatedly applying Lemma 2 to move each $p_i$ with $i \ge j$ gradually toward $p_n$. As a result, $q_1 = \sum_{i=1}^{k} p_i$, $q_n = \sum_{i=j}^{n} p_i$, $q_i = 0$ for each $i \in X \setminus \{1, n\}$, and $AD(q) > AD(p)$. ☐
Lemma 4.
Given a distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$ where $p_1 + p_n = 1$ and $p_i = 0$ for each $i \in X \setminus \{1, n\}$, $AD(p)$ is maximized when $p_1 = p_n = 0.5$.
Proof.
Without loss of generality, let $p_1 = \frac{1}{2} + \delta$ and $p_n = \frac{1}{2} - \delta$ for some $\delta \ge 0$. Then, Equation (3) yields $m = p_1 + n\,p_n = \left(\frac{1}{2} + \delta\right) + n\left(\frac{1}{2} - \delta\right) = \frac{1+n}{2} + \delta(1-n)$.
If $\delta = 0$, then $p_1 = p_n = 0.5$. Use $AD_0$ to denote the value of $AD(p)$ at $\delta = 0$. Then,
$$AD_0 = p_1(m - 1) + p_n(n - m) = \frac{1}{2}\left(\frac{1+n}{2} - 1\right) + \frac{1}{2}\left(n - \frac{1+n}{2}\right) = \frac{n-1}{2},$$
$$\begin{aligned}
AD(p) &= \left(\tfrac{1}{2} + \delta\right)(m - 1) + \left(\tfrac{1}{2} - \delta\right)(n - m) \\
&= \left(\tfrac{1}{2} + \delta\right)\left(\tfrac{n-1}{2} + \delta(1-n)\right) + \left(\tfrac{1}{2} - \delta\right)\left(\tfrac{n-1}{2} - \delta(1-n)\right) \\
&= \tfrac{n-1}{2} + 2\delta^2(1-n) = \tfrac{n-1}{2} - 2\delta^2(n-1) \le AD_0. \;☐
\end{aligned}$$
Corollary 1.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le AD(p) \le \frac{n-1}{2}$ holds.
Proof.
The upper bound $\frac{n-1}{2}$ is a direct result of Lemmas 3 and 4, and occurs when $p_1 = p_n = 0.5$. The lower bound 0 follows from the definition of $AD(p)$ in Equation (5), and occurs when $p_i = 1$ for some $i \in X$. ☐

Appendix B

In this section, we derived the range of $V(p)$, where $p$ is a probability distribution over $X = \{1, 2, \ldots, n\}$ with mean $m$. The proofs follow steps similar to those in Appendix A.
Lemma 5.
Let $p(x)$ and $q(x)$ be two probability distributions over $X = \{1, 2, \ldots, n\}$, let $p_i$ and $q_i$ denote $p(x = i)$ and $q(x = i)$, respectively, and let $p_i < 1$ and $q_i < 1$ for each $i \in X$. Let $m = \sum_{i=1}^{n} i\,p_i$, and let $k$ denote the greatest integer satisfying $1 < k \le m$ and $p_k > 0$. If $q_{k-1} = p_{k-1} + p_k$, $q_k = 0$, and $q_i = p_i$ for each $i \in X \setminus \{k-1, k\}$, then $V(q) > V(p)$.
Proof.
By Equation (3), the mean of $q(x)$ is $m' = m - p_k$. Moving the probability $p_k$ from $x = k$ to $x = k - 1$ changes the second moment about $m$ by
$$\sum_{i=1}^{n} (i - m)^2\,q_i - V(p) = p_k\left(\left((k-1) - m\right)^2 - (k - m)^2\right) = p_k\left(1 + 2(m - k)\right),$$
and, since the variance about the new mean satisfies $V(q) = \sum_{i=1}^{n} (i - m)^2\,q_i - (m - m')^2$, we obtain
$$V(q) = V(p) + p_k\left(1 + 2(m - k)\right) - p_k^2 = V(p) + 2 p_k (m - k) + p_k (1 - p_k) > V(p),$$
because $k \le m$ and $0 < p_k < 1$. ☐
Lemma 6.
Let $p(x)$ and $q(x)$ be two probability distributions over $X = \{1, 2, \ldots, n\}$, let $p_i$ and $q_i$ denote $p(x = i)$ and $q(x = i)$, respectively, and let $p_i < 1$ and $q_i < 1$ for each $i \in X$. Let $m = \sum_{i=1}^{n} i\,p_i$, and let $j$ denote the smallest integer satisfying $m < j < n$ and $p_j > 0$. If $q_j = 0$, $q_{j+1} = p_j + p_{j+1}$, and $q_i = p_i$ for each $i \in X \setminus \{j, j+1\}$, then $V(q) > V(p)$.
Proof.
By Equation (3), the mean of $q(x)$ is $m' = m + p_j$. Moving the probability $p_j$ from $x = j$ to $x = j + 1$ changes the second moment about $m$ by
$$\sum_{i=1}^{n} (i - m)^2\,q_i - V(p) = p_j\left(\left((j+1) - m\right)^2 - (j - m)^2\right) = p_j\left(1 + 2(j - m)\right),$$
and, since $V(q) = \sum_{i=1}^{n} (i - m)^2\,q_i - (m' - m)^2$, we obtain
$$V(q) = V(p) + p_j\left(1 + 2(j - m)\right) - p_j^2 = V(p) + 2 p_j (j - m) + p_j (1 - p_j) > V(p),$$
because $j > m$ and $0 < p_j < 1$. ☐
Lemma 7.
Given a distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, there exists a distribution $q(x)$ with $q_1 + q_n = 1$ and $q_i = 0$ for each $i \in X \setminus \{1, n\}$, satisfying $V(q) \ge V(p)$.
Proof.
First, consider the trivial case of $p_i = 1$ for some $i \in X$. Let $q_1 = 1$; then $V(q) = V(p)$ holds, obviously. Next, consider the case of $p_i < 1$ for each $i \in X$.
Let $m = \sum_{i=1}^{n} i\,p_i$ denote the mean of $p(x)$, let $k$ denote the greatest integer satisfying $1 < k \le m$ and $p_k > 0$, and let $j$ denote the smallest integer satisfying $m < j < n$ and $p_j > 0$. We can generate a new distribution $q(x)$ by repeatedly applying Lemma 5 to move each $p_i$ with $i \le k$ gradually toward $p_1$, and by repeatedly applying Lemma 6 to move each $p_i$ with $i \ge j$ gradually toward $p_n$. As a result, $q_1 = \sum_{i=1}^{k} p_i$, $q_n = \sum_{i=j}^{n} p_i$, $q_i = 0$ for each $i \in X \setminus \{1, n\}$, and $V(q) > V(p)$. ☐
Lemma 8.
Given a distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$ where $p_1 + p_n = 1$ and $p_i = 0$ for each $i \in X \setminus \{1, n\}$, $V(p)$ is maximized when $p_1 = p_n = 0.5$.
Proof.
Without loss of generality, let $p_1 = \frac{1}{2} + \delta$ and $p_n = \frac{1}{2} - \delta$ for some $\delta \ge 0$. Then, Equation (3) yields $m = p_1 + n\,p_n = \left(\frac{1}{2} + \delta\right) + n\left(\frac{1}{2} - \delta\right) = \frac{1+n}{2} + \delta(1-n)$.
If $\delta = 0$, then $p_1 = p_n = 0.5$. Use $V_0$ to denote the value of $V(p)$ at $\delta = 0$. Then,
$$V_0 = p_1(1 - m)^2 + p_n(n - m)^2 = \frac{1}{2}\left(\frac{1+n}{2} - 1\right)^2 + \frac{1}{2}\left(n - \frac{1+n}{2}\right)^2 = \left(\frac{n-1}{2}\right)^2,$$
$$\begin{aligned}
V(p) &= \left(\tfrac{1}{2} + \delta\right)(m - 1)^2 + \left(\tfrac{1}{2} - \delta\right)(n - m)^2 \\
&= \left(\tfrac{1}{2} + \delta\right)\left(\tfrac{n-1}{2} + \delta(1-n)\right)^2 + \left(\tfrac{1}{2} - \delta\right)\left(\tfrac{n-1}{2} - \delta(1-n)\right)^2 \\
&= \left(\tfrac{n-1}{2}\right)^2 + \delta^2(1-n)^2 + 4\delta^2(1-n)\left(\tfrac{n-1}{2}\right) = \left(\tfrac{n-1}{2}\right)^2 - \delta^2(1-n)^2 \le V_0. \;☐
\end{aligned}$$
Corollary 2.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le V(p) \le \left(\frac{n-1}{2}\right)^2$ holds.
Proof.
The upper bound $\left(\frac{n-1}{2}\right)^2$ is a direct result of Lemmas 7 and 8, and occurs when $p_1 = p_n = 0.5$. The lower bound 0 follows from the definition of $V(p)$ in Equation (4), and occurs when $p_i = 1$ for some $i \in X$. ☐

Appendix C

In this section, we derived the range of $S(p)$, where $p$ is a probability distribution over $X = \{1, 2, \ldots, n\}$ with mean $m$. First, Lemma 9 is used to split the probability at $x = j$ into probabilities at $x = 1$ and at $x = \lfloor m \rfloor$ for $1 < j < \lfloor m \rfloor$. We can repeatedly apply Lemma 9 until $p_j = 0$ for all $1 < j < \lfloor m \rfloor$, yielding a new probability distribution $q$ such that $S(q) > S(p)$.
Lemma 9.
Let $p(x)$ be a probability distribution over $X = \{1, 2, \ldots, n\}$. Let $m = \sum_{i=1}^{n} i\,p_i$ and $k = \lfloor m \rfloor$. If there exists $p_j > 0$ where $1 < j < k$, then $S(q) > S(p)$, where $q(x)$ is a probability distribution over $X$ with $q_1 = p_1 + \frac{k-j}{k-1}\,p_j$, $q_j = 0$, $q_k = p_k + \frac{j-1}{k-1}\,p_j$, and $q_i = p_i$ for $i \in X \setminus \{1, j, k\}$.
Proof.
By Equation (3), the mean of $q(x)$ is also $m$. Then,
$$\begin{aligned}
S(q) &= \left(\sum_{i \ne 1, j, k} |m - i|^3\,p_i\right) + (m - 1)^3\left(p_1 + \tfrac{k-j}{k-1}\,p_j\right) + (m - j)^3 \cdot 0 + (m - k)^3\left(p_k + \tfrac{j-1}{k-1}\,p_j\right) \\
&= \left(\sum_{i \ne j} |m - i|^3\,p_i\right) + \frac{p_j}{k-1}\left((m - 1)^3(k - j) + (m - k)^3(j - 1)\right) \\
&= S(p) + p_j\left((m - 1)^3\tfrac{k-j}{k-1} + (m - k)^3\tfrac{j-1}{k-1} - (m - j)^3\right) \\
&= S(p) + p_j\,(j - 1)(k - j)\left(3m - (k + j) - 1\right).
\end{aligned}$$
Then, $1 < j < k = \lfloor m \rfloor \le m < n$ yields $k + j + 1 \le 2k \le 2m < 3m$, and thus $3m - (k + j) - 1 > 0$. Therefore, $S(q) > S(p)$. ☐
Similar to Lemma 9, Lemma 10 is used to split the probability at $x = j$ into probabilities at $x = \lceil m \rceil$ and at $x = n$ for $\lceil m \rceil < j < n$. We can repeatedly apply Lemma 10 until $p_j = 0$ for all $\lceil m \rceil < j < n$, yielding a new probability distribution $q$ such that $S(q) > S(p)$.
Lemma 10.
Let $p(x)$ be a probability distribution over $X = \{1, 2, \ldots, n\}$. Let $m = \sum_{i=1}^{n} i\,p_i$ and $k = \lceil m \rceil$. If there exists $p_j > 0$ where $k < j < n$, then $S(q) > S(p)$, where $q(x)$ is a probability distribution over $X$ with $q_k = p_k + \frac{n-j}{n-k}\,p_j$, $q_j = 0$, $q_n = p_n + \frac{j-k}{n-k}\,p_j$, and $q_i = p_i$ for $i \in X \setminus \{k, j, n\}$.
Proof.
By Equation (3), the mean of $q(x)$ is also $m$. Then,
$$\begin{aligned}
S(q) &= \left(\sum_{i \ne k, j, n} |m - i|^3\,p_i\right) + (k - m)^3\left(p_k + \tfrac{n-j}{n-k}\,p_j\right) + (j - m)^3 \cdot 0 + (n - m)^3\left(p_n + \tfrac{j-k}{n-k}\,p_j\right) \\
&= \left(\sum_{i \ne j} |m - i|^3\,p_i\right) + \frac{p_j}{n-k}\left((k - m)^3(n - j) + (n - m)^3(j - k)\right) \\
&= S(p) + p_j\left((k - m)^3\tfrac{n-j}{n-k} + (n - m)^3\tfrac{j-k}{n-k} - (j - m)^3\right) \\
&= S(p) + p_j\,(j - k)(n - j)\left(n + j + k - 3m\right).
\end{aligned}$$
Then, $1 < m \le \lceil m \rceil = k < j < n$ yields $n + j + k \ge (k + 2) + (k + 1) + k = 3k + 3 > 3m$, and thus $n + j + k - 3m > 0$. Therefore, $S(q) > S(p)$. ☐
Lemmas 11, 12, and 13 are used to split the probabilities at $x = m$, $x = \lfloor m \rfloor$, and $x = \lceil m \rceil$, respectively, into probabilities at $x = 1$ and $x = n$.
Lemma 11.
Let $p(x)$ be a probability distribution over $X = \{1, 2, \ldots, n\}$, and let $m = \sum_{i=1}^{n} i\,p_i$. If $m \in X$ and $p_m > 0$, then $S(q) > S(p)$, where $q(x)$ is a probability distribution over $X$ with $q_1 = p_1 + \frac{n-m}{n-1}\,p_m$, $q_m = 0$, $q_n = p_n + \frac{m-1}{n-1}\,p_m$, and $q_i = p_i$ for $i \in X \setminus \{1, m, n\}$.
Proof.
By Equation (3), the mean of $q(x)$ is also $m$. Then,
$$\begin{aligned}
S(q) &= \left(\sum_{i \ne 1, m, n} |m - i|^3\,p_i\right) + (m - 1)^3\left(p_1 + \tfrac{n-m}{n-1}\,p_m\right) + |m - m|^3 \cdot 0 + (n - m)^3\left(p_n + \tfrac{m-1}{n-1}\,p_m\right) \\
&= S(p) + (m - 1)^3\left(\tfrac{n-m}{n-1}\,p_m\right) + (n - m)^3\left(\tfrac{m-1}{n-1}\,p_m\right) \\
&= S(p) + \frac{p_m (m - 1)(n - m)}{n - 1}\left((m - 1)^2 + (n - m)^2\right) > S(p). \;☐
\end{aligned}$$
Lemma 12.
Let $p(x)$ be a probability distribution over $X = \{1, 2, \ldots, n\}$, $m = \sum_{i=1}^{n} i\,p_i$, and $k = \lfloor m \rfloor$. If $1 < k < m$ and $p_k > 0$, then $S(q) > S(p)$, where $q(x)$ is a probability distribution over $X$ with $q_1 = p_1 + \frac{n-k}{n-1}\,p_k$, $q_k = 0$, $q_n = p_n + \frac{k-1}{n-1}\,p_k$, and $q_i = p_i$ for $i \in X \setminus \{1, k, n\}$.
Proof.
By Equation (3), the mean of $q(x)$ is also $m$. Then,
$$\begin{aligned}
S(q) &= S(p) + \frac{p_k}{n-1}\left((m - 1)^3(n - k) + (n - m)^3(k - 1) - (m - k)^3(n - 1)\right) \\
&= S(p) + \frac{p_k}{n-1}\left(\left((m - 1)^3 - (m - k)^3\right)(n - k) + \left((n - m)^3 - (m - k)^3\right)(k - 1)\right) \\
&> S(p) + \frac{p_k (k - 1)}{n-1}\left((k - 1)^2(n - k) + (n - m)^3 - (m - k)^3\right),
\end{aligned}$$
using $(n - 1) = (n - k) + (k - 1)$ and $(m - 1)^3 = \left((m - k) + (k - 1)\right)^3 > (m - k)^3 + (k - 1)^3$. Then, $k - 1 \ge 1 > m - k > 0$ and $n > m$ yield $(k - 1)^2 > (m - k)^2 > 0$ and $n - k > m - k$, and thus $(k - 1)^2(n - k) > (m - k)^3$. Since also $(n - m)^3 > 0$, it follows that $S(q) > S(p)$. ☐
Lemma 13.
Let $p(x)$ be a probability distribution over $X = \{1, 2, \ldots, n\}$, $m = \sum_{i=1}^{n} i\,p_i$, and $k = \lceil m \rceil$. If $m < k < n$ and $p_k > 0$, then $S(q) > S(p)$, where $q(x)$ is a probability distribution over $X$ with $q_1 = p_1 + \frac{n-k}{n-1}\,p_k$, $q_k = 0$, $q_n = p_n + \frac{k-1}{n-1}\,p_k$, and $q_i = p_i$ for $i \in X \setminus \{1, k, n\}$.
Proof.
By Equation (3), the mean of $q(x)$ is also $m$. Then,
$$\begin{aligned}
S(q) &= S(p) + \frac{p_k}{n-1}\left((m - 1)^3(n - k) + (n - m)^3(k - 1) - (k - m)^3(n - 1)\right) \\
&= S(p) + \frac{p_k}{n-1}\left(\left((m - 1)^3 - (k - m)^3\right)(n - k) + \left((n - m)^3 - (k - m)^3\right)(k - 1)\right) \\
&> S(p) + \frac{p_k (n - k)}{n-1}\left((m - 1)^3 - (k - m)^3 + (n - k)^2(k - 1)\right),
\end{aligned}$$
using $(n - 1) = (n - k) + (k - 1)$ and $(n - m)^3 = \left((n - k) + (k - m)\right)^3 > (n - k)^3 + (k - m)^3$. Then, $n - k \ge 1 > k - m > 0$ and $m > 1$ yield $(n - k)^2 > (k - m)^2 > 0$ and $k - 1 > k - m$, and thus $(n - k)^2(k - 1) > (k - m)^3$. Since also $(m - 1)^3 > 0$, it follows that $S(q) > S(p)$. ☐
Given a probability distribution $p(x)$ with its probability concentrated at both ends, Lemma 14 shows that $S(p)$ is maximized when the probability is evenly distributed between the two ends.
Lemma 14.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$ where $p_1 + p_n = 1$ and $p_i = 0$ for each $i \in X \setminus \{1, n\}$, $S(p)$ is maximized when $p_1 = p_n = 0.5$.
Proof.
Without loss of generality, let $p_1 = \frac{1}{2} + \delta$ and $p_n = \frac{1}{2} - \delta$, where $0 \le \delta \le \frac{1}{2}$. Then, Equation (3) yields $m = p_1 + n\,p_n = \left(\frac{1}{2} + \delta\right) + n\left(\frac{1}{2} - \delta\right) = \frac{1+n}{2} + \delta(1-n)$.
Use $S_0$ to denote the value of $S(p)$ at $\delta = 0$. Then,
$$S_0 = p_1(m - 1)^3 + p_n(n - m)^3 = \frac{1}{2}\left(\frac{1+n}{2} - 1\right)^3 + \frac{1}{2}\left(n - \frac{1+n}{2}\right)^3 = \left(\frac{n-1}{2}\right)^3,$$
$$\begin{aligned}
S(p) &= \left(\tfrac{1}{2} + \delta\right)(m - 1)^3 + \left(\tfrac{1}{2} - \delta\right)(n - m)^3 \\
&= \left(\tfrac{1}{2} + \delta\right)\left(\tfrac{n-1}{2} + \delta(1-n)\right)^3 + \left(\tfrac{1}{2} - \delta\right)\left(\tfrac{n-1}{2} - \delta(1-n)\right)^3 \\
&= \left(\tfrac{n-1}{2}\right)^3 + 3\left(\tfrac{n-1}{2}\right)\delta^2(1-n)^2 + 6\left(\tfrac{n-1}{2}\right)^2\delta^2(1-n) + 2\delta^4(1-n)^3 \\
&= \left(\tfrac{n-1}{2}\right)^3 - 2\delta^4(n-1)^3 \le S_0. \;☐
\end{aligned}$$
Corollary 3.
Given a probability distribution $p(x)$ over $X = \{1, 2, \ldots, n\}$, $0 \le S(p) \le \left(\frac{n-1}{2}\right)^3$ holds.
Proof.
The lower bound 0 follows from the definition of $S(p)$ in Equation (6), and occurs when $p_i = 1$ for some $i \in X$. The upper bound $\left(\frac{n-1}{2}\right)^3$ is a direct result of Lemmas 9 to 14, and occurs when $p_1 = p_n = 0.5$. First, we can repeatedly apply Lemmas 9 and 10 to yield a new distribution $q(x)$ such that $S(q) > S(p)$ and $q_j = 0$ for $1 < j < \lfloor m \rfloor$ and for $\lceil m \rceil < j < n$. Then, we apply Lemmas 11, 12, and 13 to yield a new probability distribution $r$ such that $S(r) > S(q)$ and $r_j = 0$ for $1 < j < n$. Finally, we apply Lemma 14 to show that $S(r) \le S_0$, where $S_0 = \left(\frac{n-1}{2}\right)^3$ is the value of $S(r)$ when $r_1 = r_n = 0.5$. ☐

References

  1. Tastle, W.J.; Wierman, M.J. Consensus and dissention: A measure of ordinal dispersion. Int. J. Approx. Reason. 2007, 45, 531–545.
  2. van Mierlo, H.; Vermunt, J.K.; Rutte, C.G. Composing group-level constructs from individual-level survey data. Organ. Res. Methods 2009, 12, 368–392.
  3. O'Neill, T.A. An overview of interrater agreement on Likert scales for researchers and practitioners. Front. Psychol. 2017, 8.
  4. Elzinga, C.; Wang, H.; Lin, Z.; Kumar, Y. Concordance and consensus. Inf. Sci. 2011, 181, 2529–2549.
  5. Akiyama, Y.; Nolan, J.; Darrah, M.; Abdal Rahem, M.; Wang, L. A method for measuring consensus within groups: An index of disagreement via conditional probability. Inf. Sci. 2016, 345, 116–128.
  6. Burke, M.J.; Finkelstein, L.M.; Dusig, M.S. On average deviation indices for estimating interrater agreement. Organ. Res. Methods 1999, 2, 49–68.
  7. Jost, L. Entropy and diversity. Oikos 2006, 113, 363–375.
  8. Tastle, W.J.; Wierman, M.J. Corrigendum to: "Consensus and dissention: A measure of ordinal dispersion" [Int. J. Approx. Reasoning 45 (2007) 531–545]. Int. J. Approx. Reason. 2010, 51.
Figure 1. Steps to calculate $\Phi_{mv}(p)$ for a probability distribution $p(x)$ (revised from Reference [5]).
Figure 2. Distributions of consensus scores based on different consensus measures.
Figure 3. $\Phi_1$ vs. the other consensus measures. (a) $\Phi_1$ vs. $\Phi_e$; (b) $\Phi_1$ vs. $\Phi_2$; (c) $\Phi_1$ vs. $\Phi_3$; and (d) $\Phi_1$ vs. $\Phi_{mv}$.
Figure 4. Scatter charts of $\Phi_e$, $\Phi_2$, $\Phi_3$, and $\Phi_{mv}$. (a) $\Phi_e$ vs. $\Phi_2$; (b) $\Phi_e$ vs. $\Phi_3$; (c) $\Phi_e$ vs. $\Phi_{mv}$; (d) $\Phi_2$ vs. $\Phi_3$; (e) $\Phi_2$ vs. $\Phi_{mv}$; and (f) $\Phi_3$ vs. $\Phi_{mv}$.
Table 1. From $p(x)$ to $q(x)$, the consensus score reduces the most in $\Phi_3$, less in $\Phi_2$, and least in $\Phi_1$.

|  | $AD$ | $V$ | $S$ | $\Phi_1$ | $\Phi_2$ | $\Phi_3$ |
|---|---|---|---|---|---|---|
| $p(x)$ | 1.2 | 2 | 3.6 | 0.4 | 0.5 | 0.55 |
| $q(x)$ | 1.26 | 2.1475 | 3.9369 | 0.37 | 0.463125 | 0.507888 |
| $\Phi(p) - \Phi(q)$ | - | - | - | 0.03 | 0.03688 | 0.04211 |
Table 2. Some examples of the probability distribution $p(x)$ satisfying $\Phi_{mv}(p) = 1$ or $0$.

| $p_1$ | $p_2$ | $p_3$ | $p_4$ | $p_5$ | $\Phi_{mv}(p)$ |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 1 |
| 0.75 | 0.25 | 0 | 0 | 0 | 1 |
| 0.50 | 0.50 | 0 | 0 | 0 | 1 |
| 0 | 0.96 | 0.04 | 0 | 0 | 1 |
| 0.50 | 0 | 0 | 0 | 0.50 | 0 |
| 0.90 | 0 | 0 | 0 | 0.10 | 0 |
| 0.96 | 0 | 0 | 0 | 0.04 | 0 |
| 0.98 | 0 | 0 | 0 | 0.02 | 0 |
Table 3. The distribution of the mean values of the 316,251 generated probability distributions.

| Range of mean | $1 \le m \le 2$ | $2 < m \le 3$ | $3 < m \le 4$ | $4 < m \le 5$ |
|---|---|---|---|---|
| Number of probability distributions | 16,390 | 143,747 | 140,878 | 15,236 |
| Probability | 5.18% | 45.45% | 44.55% | 4.82% |
Table 4. Kendall rank correlation coefficients between consensus measures using all 316,251 probability distributions.

|  | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|
| $\Phi_1$ | 1 | 0.990202 | 0.967755 | 0.887252 | 0.925708 |
| $\Phi_e$ | 0.990202 | 1 | 0.99008 | 0.940635 | 0.964478 |
| $\Phi_2$ | 0.967755 | 0.99008 | 1 | 0.969419 | 0.970876 |
| $\Phi_3$ | 0.887252 | 0.940635 | 0.969419 | 1 | 0.974605 |
| $\Phi_{mv}$ | 0.925708 | 0.964478 | 0.970876 | 0.974605 | 1 |
Table 5. Kendall rank correlation coefficients between consensus measures using the 16,390 probability distributions where $1 \le m \le 2$.

|  | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|
| $\Phi_1$ | 1 | 0.967110 | 0.930117 | 0.774093 | 0.785614 |
| $\Phi_e$ | 0.967110 | 1 | 0.985489 | 0.904147 | 0.891701 |
| $\Phi_2$ | 0.930117 | 0.985489 | 1 | 0.942186 | 0.900688 |
| $\Phi_3$ | 0.774093 | 0.904147 | 0.942186 | 1 | 0.940492 |
| $\Phi_{mv}$ | 0.785614 | 0.891701 | 0.900688 | 0.940492 | 1 |
Table 6. Kendall rank correlation coefficients between consensus measures using the 15,236 probability distributions where $4 < m \le 5$.

|  | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|
| $\Phi_1$ | 1 | 0.965686 | 0.930352 | 0.772132 | 0.776873 |
| $\Phi_e$ | 0.965686 | 1 | 0.986604 | 0.905085 | 0.886878 |
| $\Phi_2$ | 0.930352 | 0.986604 | 1 | 0.941809 | 0.897223 |
| $\Phi_3$ | 0.772132 | 0.905085 | 0.941809 | 1 | 0.939574 |
| $\Phi_{mv}$ | 0.776873 | 0.886878 | 0.897223 | 0.939574 | 1 |
Table 7. Mean differences between any two consensus measures.

|  | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|
| $\Phi_1$ | 0 | 0.0381011 | 0.1258246 | 0.16606489 | 0.15698299 |
| $\Phi_e$ | 0.0381011 | 0 | 0.0895491 | 0.1281988 | 0.1429986 |
| $\Phi_2$ | 0.1258246 | 0.0895491 | 0 | 0.0491733 | 0.14278058 |
| $\Phi_3$ | 0.16606489 | 0.1281988 | 0.0491733 | 0 | 0.149693 |
| $\Phi_{mv}$ | 0.15698299 | 0.1429986 | 0.14278058 | 0.149693 | 0 |
Table 8. Maximum differences between two consensus measures.

|  | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|
| $\Phi_1$ | 0 | 0.108996 | 0.25 | 0.375 | 0.9216 |
| $\Phi_e$ | 0.108996 | 0 | 0.165037 | 0.290037 | 0.858559 |
| $\Phi_2$ | 0.25 | 0.165037 | 0 | 0.249661 | 0.9216 |
| $\Phi_3$ | 0.375 | 0.290037 | 0.249661 | 0 | 0.849347 |
| $\Phi_{mv}$ | 0.9216 | 0.858559 | 0.9216 | 0.849347 | 0 |
Table 9. Some examples of the probability distribution $p(x)$, and their consensus scores.

| Example | $p_1$ | $p_2$ | $p_3$ | $p_4$ | $p_5$ | $\Phi_1$ | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.98 | 0 | 0 | 0 | 0.02 | 0.9216 | 0.858559 | 0.9216 | 0.849347 | 0 |
| 2 | 0.90 | 0 | 0 | 0 | 0.10 | 0.64 | 0.531004 | 0.64 | 0.4096 | 0 |
| 3 | 0.86 | 0 | 0 | 0 | 0.14 | 0.5184 | 0.415761 | 0.5184 | 0.268739 | 0 |
| 4 | 0 | 0 | 0.50 | 0 | 0.50 | 0.5 | 0.584963 | 0.75 | 0.875 | 0.833333 |
| 5 | 0.02 | 0 | 0 | 0.16 | 0.82 | 0.8032 | 0.796982 | 0.8944 | 0.85691 | 0.833333 |
| 6 | 0.98 | 0 | 0.02 | 0 | 0 | 0.9608 | 0.966392 | 0.9804 | 0.981168 | 0.833333 |
| 7 | 0.98 | 0 | 0 | 0.02 | 0 | 0.9412 | 0.940313 | 0.9559 | 0.936443 | 0.166667 |
| 8 | 0 | 0.96 | 0 | 0 | 0.04 | 0.8848 | 0.884354 | 0.9136 | 0.880353 | 0.99176 |
Table 10. The probability that the scores based on one consensus measure (row) are less than or equal to the scores based on another consensus measure (column) for the 316,251 generated probability distributions.

|  | $\Phi_e$ | $\Phi_2$ | $\Phi_3$ | $\Phi_{mv}$ |
|---|---|---|---|---|
| $\Phi_1$ | 94.66% | 100% | 96.41% | 58.12% |
| $\Phi_e$ | - | 100% | 96.96% | 52.04% |
| $\Phi_2$ | - | - | 84.35% | 36.84% |
| $\Phi_3$ | - | - | - | 28.01% |
