Reply published on 27 May 2021, see Sensors 2021, 21(11), 3729.
Comment

The Limits of Pairwise Correlation to Model the Joint Entropy. Comment on Nguyen Thi Thanh et al. Entropy Correlation and Its Impacts on Data Aggregation in a Wireless Sensor Network. Sensors 2018, 18, 3118

1 Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université Catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium
2 Department of Computing, Imperial College London, London SW7 2AZ, UK
3 Data Science Institute, Imperial College London, London SW7 2AZ, UK
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2021, 21(11), 3700; https://doi.org/10.3390/s21113700
Submission received: 17 February 2021 / Revised: 12 May 2021 / Accepted: 24 May 2021 / Published: 26 May 2021
Information theory is a unifying mathematical theory to measure information content, which is key for research in cryptography, statistical physics, and quantum computing [1,2,3]. A central quantity of information theory is the entropy, a metric quantifying the amount of information encoded in a signal [4]. In “Entropy Correlation and Its Impacts on Data Aggregation in a Wireless Sensor Network”, Nga et al. propose a general entropy correlation model to study the dependence patterns between multiple spatio-temporal signals [5]. They derive lower and upper bounds on the overall information entropy from only marginal and pairwise entropies, and use these bounds to study the impact of correlation on data aggregation, compression, and clustering of signals. Attempting to replicate these findings, however, we show that these bounds are incorrect, over- and underestimating the actual association patterns depending on the data. Deriving constraints and bounds on joint entropies remains a computationally difficult task and an active field of research [1,6], and new inequalities are regularly found [7,8,9,10,11]. More work is likely needed to develop a simple and general entropy correlation model for spatio-temporal signals.
Nga et al. study a system of $m$ random variables $X_1, X_2, \ldots, X_m$. They propose a normalized measure of correlation between two variables $Y$ and $Z$, defined as:
$$\rho(Y, Z) = 2 - \frac{2\,H(Y, Z)}{H(Y) + H(Z)} \qquad (1)$$
with $H$ the Shannon entropy [4]. The authors further denote by $\rho_{\min} = \min_{i \neq j} \rho(X_i, X_j)$ and $\rho_{\max} = \max_{i \neq j} \rho(X_i, X_j)$ the minimum and maximum correlation between pairs of variables, and by $H_{\min} = \min_i H(X_i)$ and $H_{\max} = \max_i H(X_i)$ the minimum and maximum individual entropies.
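As an illustration only (not part of the published Comment), the short Python sketch below computes the Shannon entropy and the normalized correlation of Equation (1) from a joint probability table; the helper names `entropy` and `rho` are ours.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability table p (any shape)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]                      # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))

def rho(p_yz):
    """Normalized correlation of Equation (1) for a joint table p_yz[y, z]."""
    h_y = entropy(p_yz.sum(axis=1))   # marginal H(Y)
    h_z = entropy(p_yz.sum(axis=0))   # marginal H(Z)
    h_yz = entropy(p_yz)              # joint H(Y, Z)
    return 2.0 - 2.0 * h_yz / (h_y + h_z)

# Independent fair bits: H(Y) = H(Z) = 1, H(Y, Z) = 2, so rho = 0.
print(rho(np.full((2, 2), 0.25)))                # -> 0.0
# Identical bits: H(Y, Z) = 1, so rho = 1.
print(rho(np.array([[0.5, 0.0], [0.0, 0.5]])))   # -> 1.0
```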
The general entropy correlation model proposed by the authors relies on two claims, both of which are incorrect:
Claim 1.
In Equation (13) and Section 2.2.2, Nga et al. claim that higher-order correlations are bounded by pairwise correlations:
$$\forall\, (i, j, k), \quad \rho_{\min} \le \rho(X_{ij}, X_k) \le \rho_{\max}$$
Claim 2.
In Equations (16) and (20), Nga et al. use Claim 1 to prove that, for any subset of $m$ variables, its joint entropy $H_m$ is bounded by:
$$l_m H_{\min} \le H_m \le k_m H_{\max}$$
with $l_m = \frac{2 - \rho_{\max}}{2}\,(l_{m-1} + 1)$, $k_m = \frac{2 - \rho_{\min}}{2}\,(k_{m-1} + 1)$, and $l_1 = k_1 = 1$.
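The following minimal sketch (ours, assuming the recurrences exactly as written above) evaluates the coefficients $l_m$ and $k_m$; it reproduces the values $l_3 = 3$ and $k_3 = 15/8$ used in the propositions below.

```python
def bound_coefficients(rho_min, rho_max, m):
    """Coefficients l_m and k_m of Claim 2, from l_1 = k_1 = 1 and the stated recurrences."""
    l, k = 1.0, 1.0
    for _ in range(m - 1):
        l = (2.0 - rho_max) / 2.0 * (l + 1.0)
        k = (2.0 - rho_min) / 2.0 * (k + 1.0)
    return l, k

# With rho_min = 1/2 (Proposition 1), k_3 = 15/8; with rho_max = 0 (Proposition 2), l_3 = 3.
print(bound_coefficients(0.5, 0.0, 3))   # -> (3.0, 1.875)
```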
We propose two examples for $m = 3$, demonstrating that all four inequalities are incorrect. In our first example, we obtain $\rho_{\min} > \rho(X_{ij}, X_k)$, which contradicts the lower bound of Claim 1, and $H_3 > k_3 H_{\max}$, which contradicts the upper bound of Claim 2.
Proposition 1.
Consider four i.i.d. discrete random variables $Y_1, Y_2, Y_3, Z$ uniformly distributed over $\{0, 1\}$. For the random variables $X_i = (Y_i, Z)$, $i = 1, 2, 3$, we have $\rho_{\min} = 1/2$, $\rho(X_{ij}, X_k) = 2/5$ for any permutation $(i, j, k)$ of $(1, 2, 3)$, $k_3 = 15/8$, $H_3 = 4$, and $H_{\max} = 2$.
Proof. 
As $Y_1, Y_2, Y_3, Z$ are independent, we have $H(X_i) = H(Y_i) + H(Z) = 2$ for $i = 1, 2, 3$ and $H(X_{ij}) = H(Y_i) + H(Y_j) + H(Z) = 3$ for $i \neq j$. Using Equation (1), we have $\rho(X_i, X_j) = 1/2$ for $i \neq j$, hence $\rho_{\min} = 1/2$, $k_2 = 2 - \rho_{\min} = 3/2$, and $k_3 = (k_2 + 1)\,k_2/2 = 15/8$. For any permutation $(i, j, k)$ of $(1, 2, 3)$, we have $H_3 = H(X_{ijk}) = H(Y_i) + H(Y_j) + H(Y_k) + H(Z) = 4$, hence $\rho(X_{ij}, X_k) = 2/5$. □
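Proposition 1 can also be verified numerically. The sketch below (ours, purely illustrative) enumerates the 16 equiprobable outcomes of $(Y_1, Y_2, Y_3, Z)$ and recomputes the entropies and correlations from the resulting distribution.

```python
import itertools
from collections import Counter
from math import log2

def H(counts, total):
    """Shannon entropy (bits) of a distribution given as a Counter of outcomes."""
    return -sum(c / total * log2(c / total) for c in counts.values())

def rho(h_y, h_z, h_yz):
    """Normalized correlation of Equation (1), computed directly from entropies."""
    return 2 - 2 * h_yz / (h_y + h_z)

# All 16 equiprobable outcomes of the independent bits (Y1, Y2, Y3, Z).
outcomes = list(itertools.product((0, 1), repeat=4))
N = len(outcomes)
X = [[(o[i], o[3]) for o in outcomes] for i in range(3)]    # X_i = (Y_i, Z)

h_1 = H(Counter(X[0]), N)                                   # H(X_1) = 2
h_12 = H(Counter(zip(X[0], X[1])), N)                       # H(X_12) = 3
h_123 = H(Counter(zip(X[0], X[1], X[2])), N)                # H_3 = 4
print(rho(h_1, h_1, h_12))      # rho(X_1, X_2) = rho_min = 0.5
print(rho(h_12, h_1, h_123))    # rho(X_12, X_3) = 0.4 < rho_min, contradicting Claim 1
print(h_123, ">", 15 / 8 * 2)   # H_3 = 4 > k_3 * H_max = 3.75, contradicting Claim 2
```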
In our second example, we obtain $\rho_{\max} < \rho(X_{ij}, X_k)$, which contradicts the upper bound of Claim 1, and $H_3 < l_3 H_{\min}$, which contradicts the lower bound of Claim 2.
Proposition 2.
Consider three discrete random variables $X_1, X_2, X_3$ uniformly distributed over $\{0, 1\}$ that are pairwise independent and satisfy $X_1 \oplus X_2 \oplus X_3 = 0$, where $\oplus$ denotes the xor operation. We have $\rho_{\max} = 0$, $\rho(X_{ij}, X_k) = 2/3$ for any permutation $(i, j, k)$ of $(1, 2, 3)$, $l_3 = 3$, $H_3 = 2$, and $H_{\min} = 1$.
Proof. 
We have $H(X_i) = 1$ for $i = 1, 2, 3$ and, as the variables are pairwise independent, $H(X_{ij}) = H(X_i) + H(X_j) = 2$ for $i \neq j$. Using Equation (1), we have $\rho(X_i, X_j) = 0$ for $i \neq j$, hence $\rho_{\max} = 0$, $l_2 = 2 - \rho_{\max} = 2$, and $l_3 = (l_2 + 1)\,l_2/2 = 3$. For any permutation $(i, j, k)$ of $(1, 2, 3)$, we have $H_3 = H(X_{ijk}) = 2$, hence $\rho(X_{ij}, X_k) = 2/3$. □
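A similar check (again ours and illustrative only) for Proposition 2 uses the four equiprobable outcomes allowed by the parity constraint:

```python
from collections import Counter
from math import log2

def H(counts, total):
    """Shannon entropy (bits) of a distribution given as a Counter of outcomes."""
    return -sum(c / total * log2(c / total) for c in counts.values())

# The four equiprobable outcomes of (X1, X2, X3) satisfying X1 xor X2 xor X3 = 0.
outcomes = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]
N = len(outcomes)

h_1 = H(Counter(o[0] for o in outcomes), N)            # H(X_1) = 1
h_12 = H(Counter((o[0], o[1]) for o in outcomes), N)   # H(X_12) = 2
h_123 = H(Counter(outcomes), N)                        # H_3 = 2
print(2 - 2 * h_12 / (h_1 + h_1))     # rho(X_1, X_2) = rho_max = 0
print(2 - 2 * h_123 / (h_12 + h_1))   # rho(X_12, X_3) = 2/3 > rho_max, contradicting Claim 1
print(h_123, "<", 3 * 1)              # H_3 = 2 < l_3 * H_min = 3, contradicting Claim 2
```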
Overall, the two new inequalities derived by Nga et al. for the joint entropy $H_m$ do not appear to be correct starting at $m = 3$. The errors in the model stem from the assumption made in Claim 1 that pairwise and higher-order associations share the same minimum and maximum. The authors validate their method on a very specific dataset with $\rho_{\min} = 0.6$, $H_{\min} = 2.16$, and $H_{\max} = 2.55$, yet our examples show that different association structures yield widely different joint entropies. Bounding the joint entropy allows the authors to study the impact of correlation on data aggregation, compression, and clustering of signals. Although different bounds could potentially offer similar results, the broader conclusions of this article may not hold in practice.
Finally, deriving constraints and bounds on joint entropies is a computationally difficult task and an active field of research [1,6,7,8,9,10,11]. Bounding the joint entropy $H_m$ calls for both theoretical derivations and numerical estimations, building on research on entropic vectors. The entropic vector of the random variables $X_1, X_2, \ldots, X_m$ is the vector of the entropies of all $2^m - 1$ non-empty subsets of these variables. The set of all entropic vectors is a convex cone, for which a polyhedral outer-approximation is known (Theorem 1, [12]). For instance, we derive below, in Proposition 3, lower and upper bounds for $H_3$ that are tight because Equations (2) and (3) completely describe the entropic cone for three variables (Theorem 2, [12]); this suggests an alternative approach that could also yield lower and upper bounds for $m > 3$. These bounds rely on the following inequalities (Theorem 2.34, [6]):
$$H(X_I) \le H(X_J) \qquad (2)$$
which is valid for any subsets $I \subseteq J \subseteq \{1, \ldots, m\}$, and
$$H(X_I) + H(X_J) \ge H(X_{I \cup J}) + H(X_{I \cap J}) \qquad (3)$$
which is valid for any subsets $I, J \subseteq \{1, \ldots, m\}$.
Proposition 3.
For any three random variables $X_1, X_2, X_3$, the following inequalities hold:
$$\max\!\big(H(X_{12}), H(X_{23}), H(X_{31})\big) \le H_3 \le \min\!\big(H(X_{31}) + H(X_{12}) - H(X_1),\; H(X_{12}) + H(X_{23}) - H(X_2),\; H(X_{23}) + H(X_{31}) - H(X_3)\big).$$
Proof. 
For any permutation $(i, j, k)$ of $(1, 2, 3)$, Equation (2) with $I = \{i, j\}$ and $J = \{i, j, k\}$ gives $H(X_{ij}) \le H(X_{ijk}) = H_3$, and Equation (3) with $I = \{i, j\}$ and $J = \{j, k\}$ gives $H(X_{ij}) + H(X_{jk}) \ge H(X_{ijk}) + H(X_j)$, which implies $H_3 = H(X_{ijk}) \le H(X_{ij}) + H(X_{jk}) - H(X_j)$. □
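Proposition 3 can be checked numerically for arbitrary joint distributions. The sketch below (ours; the ternary alphabets and the random seed are arbitrary choices) draws a random joint distribution of three variables and verifies that $H_3$ falls between the two bounds.

```python
import numpy as np

def H(p, axes):
    """Shannon entropy (bits) of the marginal of the joint table p over the given axes."""
    drop = tuple(a for a in range(p.ndim) if a not in axes)
    q = p.sum(axis=drop)
    q = q[q > 0]
    return float(-(q * np.log2(q)).sum())

rng = np.random.default_rng(0)
p = rng.random((3, 3, 3))
p /= p.sum()                                  # random joint distribution of (X1, X2, X3)

h1, h2, h3 = H(p, (0,)), H(p, (1,)), H(p, (2,))
h12, h23, h31 = H(p, (0, 1)), H(p, (1, 2)), H(p, (0, 2))
h123 = H(p, (0, 1, 2))

lower = max(h12, h23, h31)
upper = min(h31 + h12 - h1, h12 + h23 - h2, h23 + h31 - h3)
print(lower <= h123 <= upper)                 # True: the bounds of Proposition 3 hold
```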
Similar bounds can be obtained for $m > 3$ using Equations (2) and (3), but their tightness is not guaranteed, as the entropic cone is not completely described by these inequalities for $m > 3$ (Theorem 6, [13]). This gap could be reduced numerically by iteratively producing linear cuts that refine the polyhedral outer-approximation of the entropic cone given by Equations (2) and (3) [14]. Taken together, our findings suggest that theoretical derivations ($m \le 3$) and numerical approximations ($m > 3$) on the entropic cone might provide future research directions towards a robust general entropy correlation model.
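To make this concrete, one possible sketch (ours; the function `shannon_bounds` is a hypothetical helper written with SciPy's `linprog`, not the method of [14], and for $m > 3$ the resulting bounds are in general not tight) formulates Equations (2) and (3) as a linear program over the $2^m - 1$ subset entropies and optimizes $H_m$ given measured single and pairwise entropies.

```python
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def shannon_bounds(m, fixed):
    """Lower/upper bounds on H_m over the polyhedral outer approximation (2)-(3).

    `fixed` maps frozensets of variable indices (1-based) to known entropies,
    e.g. all singletons and pairs. Returns (lower, upper) bounds on H_m.
    """
    subsets = [frozenset(c) for r in range(1, m + 1)
               for c in combinations(range(1, m + 1), r)]
    idx = {s: i for i, s in enumerate(subsets)}
    n = len(subsets)
    vec = lambda s: np.eye(n)[idx[frozenset(s)]] if s else np.zeros(n)

    A_ub, b_ub = [], []                          # rows a such that a @ h <= 0
    for s in subsets:                            # Equation (2): H(X_S) <= H(X_{S + i})
        for i in range(1, m + 1):
            if i not in s:
                A_ub.append(vec(s) - vec(s | {i}))
                b_ub.append(0.0)
    for s, t in combinations(subsets, 2):        # Equation (3): submodularity
        A_ub.append(vec(s | t) + vec(s & t) - vec(s) - vec(t))
        b_ub.append(0.0)

    A_eq = [vec(s) for s in fixed]               # pin the measured entropies
    b_eq = list(fixed.values())
    c = vec(range(1, m + 1))                     # objective: H_m

    out = []
    for sign in (+1, -1):                        # minimize, then maximize H_m
        res = linprog(sign * c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
        out.append(sign * res.fun)
    return tuple(out)

# Entropies from Proposition 1: H(X_i) = 2 and H(X_ij) = 3 (the true H_3 is 4).
fixed = {frozenset({i}): 2.0 for i in (1, 2, 3)}
fixed.update({frozenset(pair): 3.0 for pair in combinations((1, 2, 3), 2)})
print(shannon_bounds(3, fixed))                  # -> (3.0, 4.0), matching Proposition 3
```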

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yeung, R.W. The Science of Information. In Information Theory and Network Coding; Yeung, R.W., Ed.; Springer: Boston, MA, USA, 2008; pp. 1–4. [Google Scholar]
  2. Lesne, A. Shannon entropy: A rigorous notion at the crossroads between probability, information theory, dynamical systems and statistical physics. Math. Struct. Comput. Sci. 2014, 24, e240311. [Google Scholar] [CrossRef] [Green Version]
  3. Vedral, V. The role of relative entropy in quantum information theory. Rev. Mod. Phys. 2002, 74, 197–234. [Google Scholar] [CrossRef] [Green Version]
  4. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef] [Green Version]
  5. Nguyen Thi Thanh, N.; Nguyen Kim, K.; Ngo Hong, S.; Ngo Lam, T. Entropy correlation and its impacts on data aggregation in a wireless sensor network. Sensors 2018, 18, 3118. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Yeung, R.W. A First Course in Information Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  7. Matus, F. Infinitely Many Information Inequalities. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 41–44. [Google Scholar]
  8. Zhang, Z.; Yang, J. On a new non-Shannon-type information inequality. In Proceedings of the IEEE International Symposium on Information Theory, Lausanne, Switzerland, 30 June–5 July 2002; p. 235. [Google Scholar]
  9. Makarychev, K.; Makarychev, Y.; Romashchenko, A.; Vereshchagin, N. A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2002, 2, 147–166. [Google Scholar] [CrossRef] [Green Version]
  10. Matúš, F. Conditional Independences among Four Random Variables III: Final Conclusion. Comb. Probab. Comput. 1999, 8, 269–276. [Google Scholar] [CrossRef]
  11. Dougherty, R.; Freiling, C.; Zeger, K. Six New Non-Shannon Information Inequalities. In Proceedings of the 2006 IEEE International Symposium on Information Theory, Seattle, WA, USA, 9–14 July 2006; pp. 233–236. [Google Scholar]
  12. Zhang, Z.; Yeung, R.W. A non-Shannon-type conditional inequality of information quantities. IEEE Trans. Inf. Theory 1997, 43, 1982–1986. [Google Scholar] [CrossRef]
  13. Zhang, Z.; Yeung, R.W. On characterization of entropy function via information inequalities. IEEE Trans. Inf. Theory 1998, 44, 1440–1452. [Google Scholar] [CrossRef] [Green Version]
  14. Legat, B.; Jungers, R.M. Parallel optimization on the Entropic Cone. In Proceedings of the 37th Symposium on Information Theory in the Benelux, Louvain-la-Neuve, Belgium, 19–20 May 2016; pp. 206–211. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

