# Quantifying Data Dependencies with Rényi Mutual Information and Minimum Spanning Trees


## Abstract


## 1. Introduction

## 2. Dependence and Entropy

#### 2.1. Rényi Entropy and Divergence

#### 2.2. Entropy as Measure of Dependence

#### 2.3. Estimator of the Rényi Entropy

#### 2.4. Quantifier of Dependence

#### 2.5. Proof of Concept

## 3. Approximation Methods

#### 3.1. Sampling-Based MST

#### 3.2. Cluster-Based MST

#### 3.3. Multilevel MST

#### 3.4. Comparison

## 4. Validation of the Proposed FMST Estimator

#### 4.1. Comparison with the Exact Value of the Rényi Divergence

## 5. Test Cases

#### 5.1. Ishigami Function

#### 5.2. The El Niño Dataset

## 6. Discussion

## Author Contributions

## Funding

## Conflicts of Interest


**Figure 1.** Illustration of the MST for two datasets on the unit square. One dataset is sampled from a bivariate independent distribution (**left**), and the other from a strongly nonlinear dependent distribution (**right**).
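The contrast described in the Figure 1 caption can be reproduced with a small sketch. It builds the Euclidean MST with a plain $O(N^2)$ Prim's algorithm and sums the edge lengths raised to a power `gamma`. The function name `mst_length`, the parameter `gamma`, and the sine-curve "dependent" dataset are illustrative assumptions, not the implementation or the distributions used in the article.

```python
import math
import random


def mst_length(points, gamma=1.0):
    """Sum of (edge length ** gamma) over the Euclidean MST, via Prim's algorithm."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    # best[i] is the shortest distance from point i to the tree built so far.
    best = [math.inf] * n
    best[0] = 0.0  # start the tree at the first point (contributes 0 to the sum)
    total = 0.0
    for _ in range(n):
        # Add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u] ** gamma
        # Update the distances of the remaining points to the grown tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


rng = random.Random(0)
n_points = 200

# Independent coordinates, uniform on the unit square.
independent = [(rng.random(), rng.random()) for _ in range(n_points)]

# Hypothetical strongly dependent data: y is a deterministic function of x.
dependent = [
    (x, 0.5 + 0.4 * math.sin(2.0 * math.pi * x))
    for x in (rng.random() for _ in range(n_points))
]

len_independent = mst_length(independent)
len_dependent = mst_length(dependent)
print(f"MST length, independent data: {len_independent:.3f}")
print(f"MST length, dependent data:   {len_dependent:.3f}")
```

Points concentrated near a curve are connected by short edges, so the dependent sample gives a much shorter tree than the space-filling independent sample. This is the geometric effect that an MST-based entropy estimate picks up.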

**Figure 2.** Empirical distribution of ${H}_{\alpha}^{*}(Z)$ (Equation (10)) for the bivariate uniform distribution.

**Figure 3.** Empirical distributions of ${H}_{\alpha}^{*}(Z)$ for data from the normal distribution and the two shape distributions, for small $\rho$ and $A$.

**Figure 4.** Comparison of the Rényi entropy (**left**) and the quantifier of dependence (**right**) for the normal distribution and the shape distributions, with varying parameters ($\rho$ and $A$). The Rényi entropy is computed exactly using Equation (2) without transforming the data. For visualization purposes, the values for the normal distribution have been translated by its value for $\rho = 0$, which is $2\log(2\sqrt{2\pi})$. The quantifier in Equation (10) was evaluated using data that were sampled from the distribution and then rank-transformed. We show the mean and 95% confidence intervals of ${H}_{\alpha}^{*}(Z)$. Note that the numerical values of the entropy and ${H}_{\alpha}^{*}(X)$ are not supposed to coincide (cf. Equations (8) and (10)).
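The offset quoted in the Figure 4 caption can be checked by hand. The derivation below assumes the standard definition of the Rényi entropy, a bivariate standard normal density $f$, and order $\alpha = 1/2$. None of these choices is stated in the caption, so they are assumptions that happen to reproduce the quoted constant.

$$
\begin{aligned}
H_\alpha(f) &= \frac{1}{1-\alpha}\,\log \int f(x)^{\alpha}\,dx,\\
\int_{\mathbb{R}^2} f(x)^{\alpha}\,dx &= (2\pi)^{-\alpha}\int_{\mathbb{R}^2} e^{-\alpha \|x\|^2/2}\,dx = (2\pi)^{-\alpha}\,\frac{2\pi}{\alpha} = \frac{(2\pi)^{1-\alpha}}{\alpha},\\
H_\alpha(f) &= \log(2\pi) + \frac{\log \alpha}{\alpha - 1},\\
H_{1/2}(f) &= \log(2\pi) + 2\log 2 = \log(8\pi) = 2\log\!\left(2\sqrt{2\pi}\right).
\end{aligned}
$$

Under the same assumptions, but with unit variances and correlation $\rho$, the covariance determinant is $1-\rho^2$ and the entropy changes by $\tfrac{1}{2}\log(1-\rho^2)$. After subtracting the $\rho = 0$ value, the translated curve would then be $\tfrac{1}{2}\log(1-\rho^2)$.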

**Figure 6.** Empirical CDF of ${H}_{\alpha}^{*}(Z)$ for the bivariate uniform distribution. The CDF becomes narrower for larger datasets (increasing $N$).

**Figure 7.** Error estimates for different approximation methods. The left panel uses independent data and the right panel uses dependent data. From left to right, the methods are: FMST, random sampling from subsets-based MST, stratified sampling from subsets-based MST, K-means cluster-based MST (unweighted and weighted), and PCA-based cluster-based MST (unweighted and weighted).

**Figure 8.** Empirical distribution for the uniform distribution for varying $N$. The solid lines refer to the distributions based on the MST, while the dashed lines refer to the distributions based on the FMST. Results using the MST are limited to $N\le {10}^{3}$ because of the high computational cost.

**Figure 9.** Behavior of the FMST estimator for two different types of data distribution. The left panel shows the mean and empirical 95% confidence intervals; the right panel compares the means of the MST (diamonds) and FMST (squares) estimates (see also Figure 4).

**Figure 11.** Empirical CDF of ${H}_{\alpha}^{*}(X)$ for the bivariate uniform distribution. The CDF becomes narrower for larger datasets (increasing $N$).

**Figure 12.** Estimates of ${H}_{\alpha}^{*}(Z)$ for two datasets. Note the difference in the range of the y-axis.

**Figure 13.** Scatterplots for the Ishigami function, with the uniform dataset on the left and the strongly dependent dataset on the right. The dependence in the input data has a large effect on the output distribution.
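For readers who want to regenerate a scatterplot of the uniform case, the sketch below evaluates the Ishigami function on independent uniform inputs. It assumes the commonly used form $f(x) = \sin x_1 + a \sin^2 x_2 + b\,x_3^4 \sin x_1$ with $a = 7$, $b = 0.1$ and inputs uniform on $[-\pi, \pi]^3$. The caption does not state these values, and the strongly dependent dataset is not reproduced here because its distribution is not given.

```python
import math
import random


def ishigami(x1, x2, x3, a=7.0, b=0.1):
    """Ishigami test function; a and b are assumed standard values."""
    return math.sin(x1) + a * math.sin(x2) ** 2 + b * x3 ** 4 * math.sin(x1)


rng = random.Random(0)
n_samples = 20000

# Independent inputs, each uniform on [-pi, pi] (assumed input domain).
inputs = [
    tuple(rng.uniform(-math.pi, math.pi) for _ in range(3))
    for _ in range(n_samples)
]
outputs = [ishigami(*x) for x in inputs]

# Sample mean of the output; the exact mean for uniform inputs is a / 2.
mean_output = sum(outputs) / n_samples
print(f"sample mean of the output: {mean_output:.3f}")
```

Plotting each column of `inputs` against `outputs` gives one scatterplot per input variable.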

**Table 1.** Comparison of the proposed estimator and Sobol indices for the Ishigami function with independent (uniformly distributed) input variables.

| Input | $S_i$ | $S_{Ti}$ | $H_{\alpha}^{*}(Z)$ |
|---|---|---|---|
| $x_1$ | 0.314 | 0.558 | −0.570 |
| $x_2$ | 0.442 | 0.442 | −1.133 |
| $x_3$ | 0 | 0.244 | −0.420 |
| $x_4$ | 0 | 0 | −0.421 |
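The Sobol columns of Table 1 can be cross-checked against closed-form values. The sketch below assumes the commonly used Ishigami parameters $a = 7$, $b = 0.1$ and independent inputs uniform on $[-\pi, \pi]$, which the table does not state. The function name `ishigami_sobol` is illustrative. The zero entries for $x_4$ are consistent with an input the function does not depend on, though that reading is also an assumption.

```python
import math


def ishigami_sobol(a=7.0, b=0.1):
    """Analytic first-order and total Sobol indices for x1, x2, x3."""
    pi4 = math.pi ** 4
    pi8 = math.pi ** 8
    # Partial variances of the ANOVA decomposition for uniform inputs.
    v1 = 0.5 * (1.0 + b * pi4 / 5.0) ** 2   # main effect of x1
    v2 = a ** 2 / 8.0                       # main effect of x2
    v13 = 8.0 * b ** 2 * pi8 / 225.0        # interaction between x1 and x3
    variance = v1 + v2 + v13                # x3 has no main effect
    first_order = (v1 / variance, v2 / variance, 0.0)
    total = ((v1 + v13) / variance, v2 / variance, v13 / variance)
    return first_order, total


first_order, total = ishigami_sobol()
for name, s_i, s_ti in zip(("x1", "x2", "x3"), first_order, total):
    print(f"{name}: S_i = {s_i:.3f}, S_Ti = {s_ti:.3f}")
```

With these assumed parameters the values round to the $S_i$ and $S_{Ti}$ entries reported for $x_1$, $x_2$ and $x_3$ in the table.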

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Eggels, A.; Crommelin, D.
Quantifying Data Dependencies with Rényi Mutual Information and Minimum Spanning Trees. *Entropy* **2019**, *21*, 100.
https://doi.org/10.3390/e21020100
