# Comparing Information-Theoretic Measures of Complexity in Boltzmann Machines


## Abstract


## 1. Introduction

### 1.1. Information-Theoretic Measures of Complexity

#### 1.1.1. Multi-Information, $MI$

#### 1.1.2. Synergistic Information, $SI$

#### 1.1.3. Total Information Flow, $IF$

#### 1.1.4. Geometric Integrated Information, ${\Phi}_{G}$

### 1.2. Boltzmann Machine

## 2. Results

## 3. Application

## 4. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A

**Theorem A1** (Concavity of $IF(X\to {X}^{\prime})$)**.**

**Proof.**

## References

1. Miller, J.H.; Page, S.E. Complex Adaptive Systems: An Introduction to Computational Models of Social Life; Princeton University Press: Princeton, NJ, USA, 2007.
2. Mitchell, M. Complexity: A Guided Tour; Oxford University Press: Oxford, UK, 2009.
3. Lloyd, S. Measures of Complexity: A Non-Exhaustive List. Available online: http://web.mit.edu/esd.83/www/notebook/Complexity.PDF (accessed on 16 July 2016).
4. Shalizi, C. Complexity Measures. Available online: http://bactra.org/notebooks/complexity-measures.html (accessed on 16 July 2016).
5. Crutchfield, J.P. Complex Systems Theory? Available online: http://csc.ucdavis.edu/~chaos/chaos/talks/CSTheorySFIRetreat.pdf (accessed on 16 July 2016).
6. Tononi, G.; Sporns, O.; Edelman, G.M. A Measure for Brain Complexity: Relating Functional Segregation and Integration in the Nervous System. Proc. Natl. Acad. Sci. USA **1994**, 91, 5033–5037.
7. Oizumi, M.; Albantakis, L.; Tononi, G. From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0. PLoS Comput. Biol. **2014**, 10, 1–25.
8. Barrett, A.B.; Seth, A.K. Practical Measures of Integrated Information for Time-Series Data. PLoS Comput. Biol. **2011**, 7, 1–18.
9. Oizumi, M.; Amari, S.I.; Yanagawa, T.; Fujii, N.; Tsuchiya, N. Measuring Integrated Information from the Decoding Perspective. PLoS Comput. Biol. **2016**, 12, 1–18.
10. Gell-Mann, M. The Quark and the Jaguar: Adventures in the Simple and the Complex; St. Martin’s Griffin: New York, NY, USA, 1994.
11. McGill, W.J. Multivariate information transmission. Psychometrika **1954**, 19, 97–116.
12. Edlund, J.A.; Chaumont, N.; Hintze, A.; Koch, C.; Tononi, G.; Adami, C. Integrated Information Increases with Fitness in the Evolution of Animats. PLoS Comput. Biol. **2011**, 7, 1–13.
13. Bialek, W.; Nemenman, I.; Tishby, N. Predictability, complexity, and learning. Neural Comput. **2001**, 13, 2409–2463.
14. Grassberger, P. Toward a quantitative theory of self-generated complexity. Int. J. Theor. Phys. **1986**, 25, 907–938.
15. Crutchfield, J.P.; Feldman, D.P. Regularities unseen, randomness observed: Levels of entropy convergence. Chaos **2003**, 13, 25–54.
16. Nagaoka, H. The exponential family of Markov chains and its information geometry. In Proceedings of the 28th Symposium on Information Theory and Its Applications (SITA2005), Okinawa, Japan, 20–23 November 2005; pp. 601–604.
17. Ay, N. Information Geometry on Complexity and Stochastic Interaction. MPI MIS PREPRINT 95. 2001. Available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.6974 (accessed on 3 July 2017).
18. Amari, S. Information Geometry and Its Applications; Springer: Chiyoda, Japan, 2016.
19. Oizumi, M.; Tsuchiya, N.; Amari, S.I. Unified framework for information integration based on information geometry. Proc. Natl. Acad. Sci. USA **2016**, 113, 14817–14822.
20. Ay, N. Information Geometry on Complexity and Stochastic Interaction. Entropy **2015**, 17, 2432–2458.
21. Csiszár, I.; Shields, P. Information Theory and Statistics: A Tutorial. Found. Trends® Commun. Inf. Theory **2004**, 1, 417–528.
22. Hertz, J.; Krogh, A.; Palmer, R.G. Introduction to the Theory of Neural Computation; Perseus Publishing: New York, NY, USA, 1991.
23. Cover, T.M.; Thomas, J.A. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing); Wiley-Interscience: Hoboken, NJ, USA, 2006.

**Figure 1.** Using graphical models, we can visualize different ways to define the “split” constraint on manifold $\mathcal{S}$ in (6). Here, we consider a two-node network $X=({X}_{1},{X}_{2})$ and its spatio-temporal stochastic interactions. (**A**) $I(X;{X}^{\prime})$ uses constraint (11). (**B**) $IF(X\to {X}^{\prime})$ uses constraint (7). (**C**) ${\Phi}_{G}(X\to {X}^{\prime})$ uses constraint (13). Dashed lines represent correlations that may or may not be present in the input distribution $p$. We do not represent these correlations with solid lines in order to highlight (with solid lines) the structure imposed on the stochastic matrices. Adapted and modified from [19].

**Figure 2.** The sigmoidal update rule as a function of the global inverse temperature: as $\beta $ increases, the stochastic update rule approaches the deterministic one given by a step function.
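The limiting behavior described in the Figure 2 caption can be checked numerically. The sketch below is only an illustration: it assumes the update probability is the logistic function $\sigma(\beta h)$ of a local field $h$, which the caption does not spell out, and the names `update_probability`, `step`, and `fields` are hypothetical.

```python
import math


def update_probability(local_field: float, beta: float) -> float:
    """Probability that a unit turns on, ASSUMING a logistic rule sigma(beta * h)."""
    z = beta * local_field
    # Numerically stable logistic: never exponentiate a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def step(local_field: float) -> float:
    """Deterministic limit: 1 for a positive field, 0 for a negative one, 0.5 at zero."""
    if local_field > 0:
        return 1.0
    if local_field < 0:
        return 0.0
    return 0.5


# Hypothetical sample of local fields at which the two rules are compared.
fields = [-2.0, -0.5, 0.0, 0.5, 2.0]

for beta in (0.5, 1.0, 5.0, 50.0):
    gap = max(abs(update_probability(h, beta) - step(h)) for h in fields)
    print(f"beta={beta:5.1f}  max gap to step function = {gap:.4f}")
```

On this fixed set of fields, the largest gap between the stochastic rule and the step function shrinks as $\beta$ grows, which is the qualitative behavior the caption describes.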

**Figure 3.** (**a**) Measures of complexity when using random weight initializations sampled uniformly between 0 and 1 (averaged over 100 trials, with error bars); (**b**) the mutual information $I$ upper-bounds $IF$ and ${\Phi}_{G}$ when using random weight initializations sampled uniformly between 0 and 1 (averaged over 100 trials, with error bars).

**Figure 4.** Measures of complexity in single instances with random weight initializations sampled uniformly between $-1$ and 1. (**a**) scenario 1; (**b**) scenario 2; (**c**) constraint (14).

**Figure 5.** A full model (left) can have both intrinsic (blue) and extrinsic (red) causal interactions contributing to its overall dynamics. Split models (**A**, **B**) formulated with an undirected output edge (purple) attempt to exclusively quantify extrinsic causal interactions (so as to strictly preserve intrinsic causal interactions after the “split”-projection). However, the output edge can end up explaining away interactions from both external factors $Y$ and (some) internal factors $X$ (red + blue = purple). As a result, such a family of split models does not properly capture the total intrinsic causal interactions present in a system.

**Figure 6.** Incremental Hebbian learning in a 9-node stochastic Hopfield network with synchronous updating (averaged over 100 trials of storing random 9-bit patterns). (**a**) $\beta =\frac{1}{2}$; (**b**) $\beta =1$.
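The setup named in the Figure 6 caption (incremental Hebbian storage followed by synchronous stochastic updates) might be sketched as follows. This is a minimal illustration, not the authors' code: it assumes $\pm 1$ units, an outer-product Hebbian rule scaled by $1/N$ with no self-connections, and a logistic update probability $\sigma(\beta h)$. None of these details is stated in the caption, and the function names are hypothetical.

```python
import math
import random


def logistic(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hebbian_store(weights, pattern):
    """Add one +/-1 pattern to the weights in place (ASSUMED outer-product rule / N)."""
    n = len(pattern)
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: no self-connections
                weights[i][j] += pattern[i] * pattern[j] / n


def synchronous_step(weights, state, beta, rng):
    """Update every unit at once, each from the same previous state."""
    new_state = []
    for row in weights:
        h = sum(w * s for w, s in zip(row, state))  # local field of this unit
        p_on = logistic(beta * h)  # ASSUMED form of the stochastic rule
        new_state.append(1 if rng.random() < p_on else -1)
    return new_state


rng = random.Random(0)
n = 9
weights = [[0.0] * n for _ in range(n)]

# Store random 9-bit patterns one at a time, probing the network after each.
for k in range(3):
    pattern = [rng.choice((-1, 1)) for _ in range(n)]
    hebbian_store(weights, pattern)
    state = synchronous_step(weights, pattern, beta=1.0, rng=rng)
    overlap = sum(a * b for a, b in zip(state, pattern)) / n
    print(f"patterns stored: {k + 1}, overlap after one step: {overlap:+.2f}")
```

Because `synchronous_step` computes every local field from the old `state` before any unit changes, the update is synchronous rather than sequential. Passing a large `beta` makes the sampled step behave almost deterministically.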

**Figure 7.** Incremental Hebbian learning in an $N$-node deterministic ($\beta \to \infty $) Hopfield network with synchronous updating (averaged over 100 trials of storing random $N$-bit patterns). (**a**) $N=9$; (**b**) $N=12$.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kanwal, M.S.; Grochow, J.A.; Ay, N. Comparing Information-Theoretic Measures of Complexity in Boltzmann Machines. *Entropy* **2017**, *19*, 310.
https://doi.org/10.3390/e19070310
