# Criticality in Pareto Optimal Grammars?


## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. Corpus Description and Preparation

#### 2.2. Word Embeddings and Coarse-Graining

#### 2.3. Maximum-Entropy Models

## 3. Results

## 4. Discussion

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest


**Figure 1.** Different levels of grammar. Language contains several layers of complexity that can be gauged using different kinds of measures and are tied to different kinds of problems. The background picture summarizes the enormous combinatorial potential connecting different levels, from the alphabet (smaller sphere) to grammatically correct sentences (larger sphere). On top of this, each layer can be described by means of a coarse-grained symbolic dynamics approach. One particularly relevant level is the one associated with the way syntax allows generating grammatically correct strings $x(t)$. As indicated in the left diagram, symbols succeed each other following some rules $\varphi$. A coarse-graining $\pi$ groups symbols into a series of classes such that the names of these classes, ${x}_{R}(t)$, also generate some symbolic dynamics whose rules are captured by $\psi$. How much information can the dynamics induced by $\psi$ recover about the original dynamics induced by $\varphi$? Good choices of $\pi$ and $\psi$ will preserve as much information as possible despite being relatively simple.
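The question posed in the caption can be illustrated with a toy calculation. The following is a minimal sketch, not the method of the paper: it assumes a short hypothetical tag sequence `x`, a hypothetical coarse-graining map `pi`, and it uses the mutual information between consecutive symbols as a simple stand-in for "information about the dynamics". All names and data are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(seq):
    """Plug-in estimate of I(X_t; X_{t+1}) in bits from consecutive pairs."""
    pairs = list(zip(seq, seq[1:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Sum over observed pairs of p(a,b) * log2(p(a,b) / (p(a) p(b))).
    return sum(
        (c / n) * log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def coarse_grain(seq, pi):
    """Apply the coarse-graining map pi symbol by symbol."""
    return [pi[s] for s in seq]


# Hypothetical fine-grained sequence x(t) of part-of-speech tags.
x = ["DT", "NN", "VBZ", "DT", "JJ", "NN", "VBD", "DT", "NNS", "VBP"]

# Hypothetical coarse-graining pi: merge noun tags and merge verb tags.
pi = {
    "DT": "D", "JJ": "A",
    "NN": "N", "NNS": "N",
    "VBZ": "V", "VBD": "V", "VBP": "V",
}

x_R = coarse_grain(x, pi)
info_fine = mutual_information(x)
info_coarse = mutual_information(x_R)

print(f"fine-grained:   {info_fine:.3f} bits")
print(f"coarse-grained: {info_coarse:.3f} bits")
```

Because the coarse-grained pair statistics are a deterministic function of the fine-grained ones, the coarse estimate can never exceed the fine one. A good $\pi$ in this toy setting is one that keeps the gap small while using few classes.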

**Figure 2.** Interactions between spins and word classes. (**a**) A first crude model with spins contains more information than we need for the kind of calculations that we wish to do right now. (**b**) A reduced version of that model gives us an interaction energy between words or classes of words. These potentials capture some non-trivial features of English syntax, e.g., the existential “there” in “there is” or modal verbs (marked E and M, respectively) have a lower interaction energy if they are followed by verbs. Interjections present a fairly large interaction energy with any other word, perhaps as a consequence of their independence within sentences.

**Figure 3.** Pareto optimal maximum entropy models of human language. Among all the models that we try out, we prefer those that are Pareto optimal in energy minimization and entropy maximization. (**a**) These reveal a hierarchy of models in which different word classes group together at different levels. The clustering reveals a series of grammatical classes that belong together owing to the statistical properties of the symbolic dynamics, such as possessives and determiners, which appear close to adjectives. (**b**) A first approximation to the Pareto front of the problem. Future implementations will try out more grammatical classes and produce better quality Pareto fronts, establishing whether phase transitions or criticality are truly present.
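The selection of models that are Pareto optimal in energy minimization and entropy maximization can be sketched as a dominance filter. This is a minimal illustration under stated assumptions: the candidate models and their (energy, entropy) values below are invented for the example, and `pareto_front` is a hypothetical helper, not code from the paper.

```python
def pareto_front(models):
    """Return the non-dominated models, sorted by increasing energy.

    Each model is a tuple (name, energy, entropy). Model A dominates
    model B if A has energy no higher and entropy no lower than B,
    and is strictly better in at least one of the two.
    """
    front = []
    for name, energy, entropy in models:
        dominated = any(
            (e2 <= energy and s2 >= entropy) and (e2 < energy or s2 > entropy)
            for _, e2, s2 in models
        )
        if not dominated:
            front.append((name, energy, entropy))
    return sorted(front, key=lambda m: m[1])


# Hypothetical candidate models: (name, energy, entropy).
candidates = [
    ("A", 1.0, 0.5),
    ("B", 2.0, 1.5),
    ("C", 2.5, 1.0),  # dominated by B: higher energy, lower entropy
    ("D", 3.0, 2.0),
    ("E", 1.5, 0.4),  # dominated by A: higher energy, lower entropy
]

front = pareto_front(candidates)
for name, energy, entropy in front:
    print(name, energy, entropy)
```

The quadratic scan is adequate for a handful of candidate models. The shape of the resulting front (for instance, cavities or sharp edges) is what one would inspect when asking about phase transitions or criticality.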

| Word class | Word class |
|---|---|
| Conjunction | Adverb |
| Cardinal number | Adverb, comparative |
| Determiner | Adverb, superlative |
| Existential there | to |
| Preposition | Interjection |
| Adjective | Verb, base form |
| Adjective, comparative | Verb, past tense |
| Adjective, superlative | Verb, gerund or present participle |
| Modal | Verb, past participle |
| Noun, singular | Verb, non-3rd person singular present |
| Noun, plural | Verb, 3rd person singular present |
| Proper noun, singular | Wh-determiner |
| Proper noun, plural | Wh-pronoun |
| Predeterminer | Possessive wh-pronoun |
| Possessive ending | Wh-adverb |
| Personal pronoun | None of the above |
| Possessive pronoun | ‘.’ |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Seoane, L.F.; Solé, R.
Criticality in Pareto Optimal Grammars? *Entropy* **2020**, *22*, 165.
https://doi.org/10.3390/e22020165

**AMA Style**

Seoane LF, Solé R.
Criticality in Pareto Optimal Grammars? *Entropy*. 2020; 22(2):165.
https://doi.org/10.3390/e22020165

**Chicago/Turabian Style**

Seoane, Luís F, and Ricard Solé.
2020. "Criticality in Pareto Optimal Grammars?" *Entropy* 22, no. 2: 165.
https://doi.org/10.3390/e22020165