Information theory is a unifying mathematical framework for measuring information content, and is key to research in cryptography, statistical physics, and quantum computing [1,2,3]. A central quantity of information theory is entropy, which quantifies the amount of information encoded in a signal [4]. In “Entropy Correlation and Its Impacts on Data Aggregation in a Wireless Sensor Network”, Nga et al. propose a general entropy correlation model to study the dependence patterns between multiple spatio-temporal signals [5]. They derive lower and upper bounds on the overall information entropy from only marginal and pairwise entropies, and use these bounds to study the impact of correlation on data aggregation, compression, and clustering of signals. Attempting to replicate these findings, however, we show that these bounds are incorrect, over- or underestimating the actual higher-order correlations and joint entropies depending on the data. Deriving constraints and bounds on joint entropies remains a computationally difficult task and an active field of research [1,6], and new inequalities are regularly found [7,8,9,10,11]. More work is likely needed to develop a simple and general entropy correlation model for spatio-temporal signals.
Nga et al. study a system of $m$ random variables $X_1, \ldots, X_m$. They propose a normalized measure of correlation $\rho(Y, Z)$ between two variables $Y$ and $Z$, defined in Equation (1) in terms of the Shannon entropies $H(Y)$, $H(Z)$, and $H(Y, Z)$ [4]. The authors further denote by $\rho_{\min}$ and $\rho_{\max}$ the minimum and maximum correlation between pairs of variables, and by $H_{\min}$ and $H_{\max}$ the minimum and maximum individual entropies.
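For concreteness, the sketch below (in Python) estimates such entropies from discrete samples and computes a normalized pairwise correlation. Since Equation (1) is not reproduced above, the normalization used here, $I(Y;Z)/H(Y,Z)$, is only an illustrative assumption and not necessarily the authors' exact definition; the helper names `entropy` and `pairwise_correlation` are ours.

```python
import numpy as np
from collections import Counter

def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of one or more
    discrete sample columns, estimated from empirical frequencies."""
    joint = list(zip(*columns))
    counts = np.array(list(Counter(joint).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def pairwise_correlation(y, z):
    """Illustrative normalized entropy correlation between two variables.
    NOTE: the normalization I(Y;Z) / H(Y,Z) is an assumption standing in
    for Equation (1), whose exact form is not reproduced here."""
    h_y, h_z, h_yz = entropy(y), entropy(z), entropy(y, z)
    mutual_information = h_y + h_z - h_yz
    return mutual_information / h_yz if h_yz > 0 else 0.0

# Example: two partially dependent binary signals.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=10_000)
z = np.where(rng.random(10_000) < 0.8, y, rng.integers(0, 2, size=10_000))
print(entropy(y), entropy(z), entropy(y, z), pairwise_correlation(y, z))
```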
The general entropy correlation model proposed by the authors relies on two claims, both of which are incorrect:
Claim 1.
In Equation (13) and Section 2.2.2, Nga et al. claim that higher-order correlations are bounded by the pairwise correlations, i.e., that any correlation $\rho$ computed over three or more of the variables satisfies
$$\rho_{\min} \;\leq\; \rho \;\leq\; \rho_{\max}.$$
Claim 2.
In Equations (16) and (20), Nga et al. use Claim 1 to prove that the joint entropy of any subset of the $m$ variables is bounded below and above by expressions that involve only the extreme individual entropies $H_{\min}$ and $H_{\max}$ and the extreme pairwise correlations $\rho_{\min}$ and $\rho_{\max}$.
We propose two simple examples demonstrating that all four inequalities are incorrect. In our first example, we obtain a higher-order correlation strictly smaller than $\rho_{\min}$, which contradicts the lower bound of Claim 1, and a joint entropy strictly larger than the claimed upper bound, which contradicts the upper bound of Claim 2.
Proposition 1.
Consider the four i.i.d. discrete random variables uniformly distributed over . For the random variables , we have , for any permutation of , , and .
Proof.
As are independent, we have for and for . Using Equation (1), we have for hence , and . For any permutation of , we have hence . □
In our second example, we obtain a higher-order correlation strictly larger than $\rho_{\max}$, which contradicts the upper bound of Claim 1, and a joint entropy strictly smaller than the claimed lower bound, which contradicts the lower bound of Claim 2.
Proposition 2.
Consider three discrete random variables $X_1, X_2, X_3$, uniformly distributed over $\{0, 1\}$, that are pairwise independent and satisfy the equation $X_1 \oplus X_2 \oplus X_3 = 0$, where ⊕ denotes the xor operation. We have $\rho(X_i, X_j) = 0$ for all $i \neq j$, whereas, for any permutation $(i, j, k)$ of $(1, 2, 3)$, $X_k$ is fully determined by $(X_i, X_j)$, so the correlation between $X_k$ and $(X_i, X_j)$ is strictly positive, while $H(X_1, X_2, X_3) = 2$ bits.
Proof.
We have $H(X_i) = 1$ bit for $i = 1, 2, 3$ and, as the variables are pairwise independent, $H(X_i, X_j) = 2$ bits for $i \neq j$. Using Equation (1), we have $\rho(X_i, X_j) = 0$ for $i \neq j$, hence $\rho_{\min} = \rho_{\max} = 0$ and $H_{\min} = H_{\max} = 1$. For any permutation $(i, j, k)$ of $(1, 2, 3)$, we have $H(X_k \mid X_i, X_j) = 0$, hence $H(X_1, X_2, X_3) = H(X_i, X_j) = 2$ bits and the correlation between $X_k$ and $(X_i, X_j)$ is strictly positive. □
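Proposition 2 is small enough to verify exhaustively. The following sketch (in Python; the helper `H` and the dictionary encoding of the joint distribution are ours) checks that each variable carries 1 bit, each pair carries 2 bits, and the three variables together also carry only 2 bits.

```python
import itertools
import math

def H(joint, indices):
    """Shannon entropy (in bits) of the marginal of `joint` on `indices`,
    where `joint` maps outcomes (x1, x2, x3) to probabilities."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)

# Proposition 2: X1, X2 are independent uniform bits and X3 = X1 xor X2,
# so X1 xor X2 xor X3 = 0 and the three variables are pairwise independent.
joint = {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in itertools.product((0, 1), repeat=2)}

print([H(joint, (i,)) for i in range(3)])                                 # [1.0, 1.0, 1.0]
print([H(joint, pair) for pair in itertools.combinations(range(3), 2)])   # [2.0, 2.0, 2.0]
print(H(joint, (0, 1, 2)))                                                # 2.0, not 3.0
```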
Overall, the two new inequalities derived by Nga et al. for the joint entropy do not appear to be correct as soon as $m = 3$. The errors in the model stem from the assumption made in Claim 1 that pairwise and higher-order associations share the same minimum and maximum. The authors validate their method on a very specific dataset, with particular values of the pairwise correlations and individual entropies, yet our examples show that different association structures yield widely different joint entropies. Bounding the joint entropy allows the authors to study the impact of correlation on data aggregation, compression, and clustering of signals. Although different bounds could potentially offer similar results, the broader conclusions of their article may not hold in practice.
Finally, deriving constraints and bounds on joint entropies is a computationally difficult task and an active field of research [1,6,7,8,9,10,11]. Theoretical derivations and numerical estimations can both be used to bound the joint entropy $H(X_1, \ldots, X_m)$, building upon research on entropic vectors. The entropic vector of the random variables $X_1, \ldots, X_m$ is the vector of the joint entropies of all non-empty subsets of these variables. The closure of the set of all entropic vectors is a convex cone, for which a polyhedral outer approximation is known (Theorem 1, [12]). For instance, we derive below, in Proposition 3, tight lower and upper bounds for $H(X_1, X_2, X_3)$ (the tightness is a consequence of the fact that Equations (2) and (3) completely describe the entropic cone for three variables (Theorem 2, [12])), suggesting an alternative approach that could lead to both lower and upper bounds for $H(X_1, \ldots, X_m)$. These bounds rely on the following inequalities (Theorem 2.34, [6]):
$$H(X_\alpha) \;\leq\; H(X_\beta), \qquad (2)$$
which is valid for any subsets $\alpha \subseteq \beta \subseteq \{1, \ldots, m\}$, and
$$H(X_\alpha) + H(X_\beta) \;\geq\; H(X_{\alpha \cup \beta}) + H(X_{\alpha \cap \beta}), \qquad (3)$$
which is valid for any subsets $\alpha, \beta \subseteq \{1, \ldots, m\}$, where $X_\alpha$ denotes the collection $(X_i)_{i \in \alpha}$ and $H(X_\emptyset) = 0$ by convention.
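As an illustration, the sketch below (in Python; the function names `entropic_vector` and `satisfies_shannon_inequalities` are ours) computes the entropic vector of an arbitrary discrete joint distribution and checks that it satisfies the monotonicity and submodularity inequalities (2) and (3), i.e., that it lies in the polyhedral outer approximation of the entropic cone.

```python
import itertools
import math
import numpy as np

def entropic_vector(p):
    """Entropic vector of m discrete variables: joint entropy (in bits) of every
    subset of the variables, computed from the joint pmf array `p` (one axis per variable)."""
    m = p.ndim
    h = {(): 0.0}
    for r in range(1, m + 1):
        for subset in itertools.combinations(range(m), r):
            axes = tuple(i for i in range(m) if i not in subset)
            marginal = p.sum(axis=axes) if axes else p
            h[subset] = float(-sum(x * math.log2(x) for x in marginal.ravel() if x > 0))
    return h

def satisfies_shannon_inequalities(h, m, tol=1e-9):
    """Check monotonicity (2) and submodularity (3) for every pair of subsets."""
    subsets = [s for r in range(m + 1) for s in itertools.combinations(range(m), r)]
    for a, b in itertools.product(subsets, repeat=2):
        union = tuple(sorted(set(a) | set(b)))
        inter = tuple(sorted(set(a) & set(b)))
        if set(a) <= set(b) and h[a] > h[b] + tol:        # Equation (2)
            return False
        if h[a] + h[b] < h[union] + h[inter] - tol:       # Equation (3)
            return False
    return True

# Random joint distribution of three binary variables.
rng = np.random.default_rng(1)
p = rng.random((2, 2, 2)); p /= p.sum()
h = entropic_vector(p)
print(h[(0, 1, 2)], satisfies_shannon_inequalities(h, m=3))  # always True
```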
Proposition 3.
For any three random variables $X_1, X_2, X_3$, the following inequalities hold:
$$\max_{i \neq j} H(X_i, X_j) \;\leq\; H(X_1, X_2, X_3) \;\leq\; \min_{\{i, j, k\} = \{1, 2, 3\}} \big[ H(X_i, X_j) + H(X_j, X_k) - H(X_j) \big].$$
Proof.
The lower bound follows from Equation (2) applied to $\alpha = \{i, j\} \subseteq \beta = \{1, 2, 3\}$, and the upper bound follows from Equation (3) applied to $\alpha = \{i, j\}$ and $\beta = \{j, k\}$, whose union is $\{1, 2, 3\}$ and whose intersection is $\{j\}$. □
Similar bounds can be obtained for $m \geq 4$ variables using Equations (2) and (3), but their tightness is not guaranteed, as the entropic cone is not completely described by these inequalities for $m \geq 4$ (Theorem 6, [13]). This gap could be reduced numerically by iteratively producing linear cuts, in order to refine the polyhedral outer approximation of the entropic cone given by Equations (2) and (3) [14]. Taken together, our findings suggest that theoretical derivations (e.g., Proposition 3) and numerical approximations (e.g., [14]) on the entropic cone might provide future research directions towards a robust general entropy correlation model.
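As a sketch of this numerical route, the code below (in Python, using `scipy.optimize.linprog`) maximizes and minimizes the joint entropy over the outer approximation given by Equations (2) and (3), with the marginal and pairwise entropies held fixed. The set-up and the function `shannon_bounds` are ours, not the algorithm of [14], whose cutting-plane refinement is not implemented; for $m = 3$ the bounds coincide with Proposition 3, while for $m \geq 4$ they remain valid but are not guaranteed to be tight.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def shannon_bounds(m, fixed):
    """Bounds on H(X_1,...,X_m) over the polyhedral outer approximation of the
    entropic cone (Equations (2) and (3)), given fixed lower-order entropies.
    `fixed` maps index subsets to entropy values, e.g. {(0,): 1.0, (0, 1): 2.0}.
    Both bounds are valid because every entropic vector lies in the relaxed region."""
    subsets = [s for r in range(1, m + 1) for s in itertools.combinations(range(m), r)]
    index = {s: k for k, s in enumerate(subsets)}
    n = len(subsets)

    A_ub, b_ub = [], []
    def add(row, s, coeff):            # add coeff * h_S to a constraint row (h_empty = 0)
        if s:
            row[index[tuple(sorted(s))]] += coeff

    for a, b in itertools.product(subsets, repeat=2):
        sa, sb = set(a), set(b)
        if sa < sb:                    # Equation (2): h_a - h_b <= 0
            row = np.zeros(n); add(row, a, 1); add(row, b, -1)
            A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n)              # Equation (3): h_union + h_inter - h_a - h_b <= 0
        add(row, sa | sb, 1); add(row, sa & sb, 1); add(row, a, -1); add(row, b, -1)
        A_ub.append(row); b_ub.append(0.0)

    A_eq = np.zeros((len(fixed), n)); b_eq = np.zeros(len(fixed))
    for k, (s, value) in enumerate(fixed.items()):
        A_eq[k, index[tuple(sorted(s))]] = 1.0
        b_eq[k] = value

    target = np.zeros(n); target[index[tuple(range(m))]] = 1.0
    lo = linprog(target, A_ub=np.array(A_ub), b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
    hi = linprog(-target, A_ub=np.array(A_ub), b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
    return lo.fun, -hi.fun

# Marginal and pairwise entropies of Proposition 2's xor example.
fixed = {(i,): 1.0 for i in range(3)}
fixed.update({pair: 2.0 for pair in itertools.combinations(range(3), 2)})
print(shannon_bounds(3, fixed))   # (2.0, 3.0), the interval given by Proposition 3
```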
Funding
This research received no external funding.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Yeung, R.W. The Science of Information. In Information Theory and Network Coding; Yeung, R.W., Ed.; Springer: Boston, MA, USA, 2008; pp. 1–4.
2. Lesne, A. Shannon entropy: A rigorous notion at the crossroads between probability, information theory, dynamical systems and statistical physics. Math. Struct. Comput. Sci. 2014, 24, e240311.
3. Vedral, V. The role of relative entropy in quantum information theory. Rev. Mod. Phys. 2002, 74, 197–234.
4. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
5. Nguyen Thi Thanh, N.; Nguyen Kim, K.; Ngo Hong, S.; Ngo Lam, T. Entropy correlation and its impacts on data aggregation in a wireless sensor network. Sensors 2018, 18, 3118.
6. Yeung, R.W. A First Course in Information Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
7. Matúš, F. Infinitely Many Information Inequalities. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 41–44.
8. Zhang, Z.; Yang, J. On a new non-Shannon-type information inequality. In Proceedings of the IEEE International Symposium on Information Theory, Lausanne, Switzerland, 30 June–5 July 2002; p. 235.
9. Makarychev, K.; Makarychev, Y.; Romashchenko, A.; Vereshchagin, N. A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2002, 2, 147–166.
10. Matúš, F. Conditional Independences among Four Random Variables III: Final Conclusion. Comb. Probab. Comput. 1999, 8, 269–276.
11. Dougherty, R.; Freiling, C.; Zeger, K. Six New Non-Shannon Information Inequalities. In Proceedings of the 2006 IEEE International Symposium on Information Theory, Seattle, WA, USA, 9–14 July 2006; pp. 233–236.
12. Zhang, Z.; Yeung, R.W. A non-Shannon-type conditional inequality of information quantities. IEEE Trans. Inf. Theory 1997, 43, 1982–1986.
13. Zhang, Z.; Yeung, R.W. On characterization of entropy function via information inequalities. IEEE Trans. Inf. Theory 1998, 44, 1440–1452.
14. Legat, B.; Jungers, R.M. Parallel optimization on the Entropic Cone. In Proceedings of the 37th Symposium on Information Theory in the Benelux, Louvain-la-Neuve, Belgium, 19–20 May 2016; pp. 206–211.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).