# Zipf’s, Heaps’ and Taylor’s Laws are Determined by the Expansion into the Adjacent Possible

## Abstract

## 1. Introduction

## 2. Zipf’s and Heaps’ Laws

#### 2.1. Frequency-Rank Relations: the Estoup-Zipf’s Law

#### 2.2. The Innovation Rate: Herdan-Heaps’ Law

#### 2.3. Zipf’s vs. Heaps’ Laws

## 3. Urn Model with Triggering

#### The Role of the Adjacent Possible: Heaps’ and Zipf’s Laws in the Classic Multicolors Pólya Urn Model

## 4. Connection of the Urn Model with Triggering and with Stochastic Processes Featuring Innovation

#### 4.1. Urn Model with Triggering and the Poisson-Dirichlet Process

#### 4.2. Urn Model with Triggering, Dirichlet Process and Hoppe Model

## 5. Fluctuation Scaling (Taylor’s Law)

## 6. Discussion

- The first point concerns the many claims in the literature that the Heaps’ law can be deduced by simply sampling a Zipf-like distribution of event frequencies. Although, as seen above, randomly drawing from a Zipf-like distribution does yield a power-law growth of the number of distinct elements, this procedure does not reproduce the empirical results. It has been conjectured in [26] that texts are subject to a topicality phenomenon, i.e., writers do not all sample the same Zipf’s law, which implies that the same word can appear at different ranking positions depending on the specific context. Though this is an interesting point, we think that deducing the Heaps’ law from the sampling of a Zipfian distribution is unsatisfactory for two reasons. First, the empirical Heaps’ and Zipf’s laws are never pure power laws. We have seen, for instance, that for written texts the frequency-rank plot features a double slope. Nevertheless, a relation exists between the exponent of the frequency-rank distribution at high ranks (rare words) and the asymptotic exponent of the Heaps’ law. In other words, the behaviour of the rarest words is responsible for the entrance rate of new words (or new items). Second, even if a pure power-law behaviour were observed, we have shown that the statistics of fluctuations, represented by the Taylor’s law, would not reproduce the empirical results (unless a specific sampling procedure based on the hypothesis of topicality is adopted [26]). The conclusion is that, in general, the Heaps’ and the Zipf’s laws are non-trivially related, and their explanation should instead be based on first principles.
- Models featuring a fixed space of possibilities are not able to reproduce the simultaneous occurrence of the three laws. For instance, a multicolor Pólya urn model [29] does not even produce power-law-like behaviours for the Zipf’s and the Heaps’ laws. It rather features a saturation phenomenon, related to the exploration of the predefined boundaries of the space of possibilities. The conclusion here is that one needs a modelling scheme in which the space of possibilities has dynamical boundaries, for instance expanding ones.
- Models that do allow the space of possibilities to expand, like the Yule-Simon model [40] or the Hoppe model, fail to explain the empirical results. In the Yule-Simon model, the innovation rate is constant and the Heaps’ law is reproduced with the trivial unitary exponent. An ad-hoc correction has been proposed by Zanette and Montemurro [41], who postulate a sublinear power-law form of the Heaps’ law from the outset, without providing any first-principle explanation for it. Moreover, the result is still not satisfactory, because the resulting time series does not obey the Taylor’s law, being instead compatible with a series of i.i.d. variables. The question is then why this approach does not reproduce the Taylor’s law even though it fixes the expansion of the space of possibilities. In our opinion, what is lacking in the scheme by Zanette and Montemurro is the interplay between the preferential attachment mechanism and the exploration of new possibilities, in other words the triggering effect, which is instead a key feature of the Pólya urn model with triggering (PUT; see next item). The situation is different for the Hoppe model [36], i.e., a multicolor Pólya urn with a special replicator color. In this case, though a self-consistent expansion of the space of possibilities is in place, an explicit triggering mechanism, in which the realization of an innovation facilitates the realization of further innovations, is lacking. As a result, the innovation rate is too weak and the Heaps’ law features only a logarithmic growth, i.e., it is slower than any sublinear power law.
- The Pólya urn model with triggering (PUT) [10], which incorporates the notion of the adjacent possible, accounts simultaneously for the three laws, Zipf’s, Heaps’ and Taylor’s, in all their regimes, without ad-hoc or arbitrary assumptions. In this case, the space of possibilities expands conditional on the occurrence of novel events, in a way that is compatible with the empirical findings. From the mathematical point of view, the expansion into the adjacent possible solves another issue related to generative models of the Zipf’s and Heaps’ laws. In PUT one can switch with continuity from the sublinear to the linear regime of the dictionary growth, and vice versa, by tuning one parameter only: the ratio $\nu /\rho $. This ratio is not limited to a ratio of integers. In fact, in the SI of [10] it was demonstrated that the same expressions for the Heaps’ and Zipf’s laws are recovered if the parameters $\rho $ and $\nu $ are drawn from a distribution with fixed means. One possible strategy is to fix an integer $\rho $ while letting $\nu $ assume any real value (in simulations, a floating-point value), with its fractional part (the mantissa) taken into account by resorting to probabilities. Therefore, it is perfectly sound to state that one switches with continuity from the sublinear regime to the linear one in the interval $|\nu /\rho -1|<\epsilon $, with $\epsilon \ll 1$, although the rigorous mathematical characterization of the transition is far from being understood.
- It should be remarked that the Poisson-Dirichlet process [12,13,14] is also able to explain the three laws (Zipf’s, Heaps’ and Taylor’s), but only in the strictly sublinear regime of the Heaps’ law. It cannot, however, account for a constant innovation rate, as the PUT modelling scheme can. We also point out that the PUT model embraces the Poisson-Dirichlet and the Dirichlet processes as particular cases.
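
The urn dynamics discussed in the points above can be made concrete with a short simulation. The following Python sketch is only an illustration under stated assumptions, not the code used for the results in this paper: the function names `simulate_put` and `loglog_slope`, the choice of `n0` initial balls of distinct colors, and the specific way a non-integer $\nu $ is realized (its integer part always, one extra ball with probability equal to its fractional part) are all assumptions introduced here.

```python
import math
import random


def simulate_put(n_steps, rho, nu, n0=1, seed=0):
    """Simulate the urn model with triggering; return D(t) for t = 1..n_steps."""
    rng = random.Random(seed)
    urn = list(range(n0))      # assumption: n0 initial balls, all of distinct colors
    next_color = n0            # label of the next brand-new color
    seen = set()               # colors that already appeared in the sequence
    growth = []
    whole, frac = int(nu), nu - int(nu)
    for _ in range(n_steps):
        ball = rng.choice(urn)            # the drawn ball stays in the urn
        urn.extend([ball] * rho)          # reinforcement: rho additional copies
        if ball not in seen:              # novelty: expand the adjacent possible
            seen.add(ball)
            # nu + 1 new colors; a fractional nu is realized in probability
            n_new = whole + 1 + (1 if rng.random() < frac else 0)
            urn.extend(range(next_color, next_color + n_new))
            next_color += n_new
        growth.append(len(seen))
    return growth


def loglog_slope(ys, t_min):
    """Least-squares slope of log ys[t-1] versus log t for t >= t_min."""
    xs = [math.log(t) for t in range(t_min, len(ys) + 1)]
    ls = [math.log(ys[t - 1]) for t in range(t_min, len(ys) + 1)]
    mx, my = sum(xs) / len(xs), sum(ls) / len(ls)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ls))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    growth = simulate_put(n_steps=20000, rho=4, nu=2)
    print("distinct colors:", growth[-1])
    print("fitted Heaps exponent:", round(loglog_slope(growth, 2000), 2))
```

For $\nu <\rho $ the fitted slope should lie in the neighbourhood of $\nu /\rho $, although a single short run is noisy. Setting, for example, `nu=2.5` exercises the probabilistic treatment of the fractional part.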

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

## Appendix A. Analytic Derivation of Heaps’ law in the Urn Model with Triggering

#### Appendix A.1. Case 1: ρ>ν

#### Appendix A.2. Case 2: ρ<ν

## Appendix B. Analytic Determination of Zipf’s Law in the Continuous Approximation

#### Appendix B.1. Preliminary Considerations

#### Appendix B.2. Master Equation

#### Appendix B.3. Particular Case ν=ρ

#### Appendix B.4. Particular Case ν=0

## References

- Estoup, J.B. Les Gammes Sténographiques; Institut Sténographique de France: Paris, France, 1916.
- Zipf, G.K. Relative Frequency as a Determinant of Phonetic Change. Harvard Stud. Class. Philol. **1929**, 40, 1–95.
- Zipf, G.K. The Psychobiology of Language; Houghton-Mifflin: New York, NY, USA, 1935.
- Zipf, G.K. Human Behavior and the Principle of Least Effort; Addison-Wesley: Reading, MA, USA, 1949.
- Herdan, G. Type-Token Mathematics: A Textbook of Mathematical Linguistics; Janua linguarum. Series Maior. No. 4; Mouton en Company: The Hague, The Netherlands, 1960.
- Heaps, H.S. Information Retrieval-Computational and Theoretical Aspects; Academic Press: Orlando, FL, USA, 1978.
- Taylor, L. Aggregation, Variance and the Mean. Nature **1961**, 189, 732.
- Kauffman, S.A. Investigations: The Nature of Autonomous Agents and the Worlds They Mutually Create; SFI Working Papers; Santa Fe Institute: Santa Fe, NM, USA, 1996.
- Kauffman, S.A. Investigations; Oxford University Press: New York, NY, USA; Oxford, UK, 2000.
- Tria, F.; Loreto, V.; Servedio, V.D.P.; Strogatz, S.H. The dynamics of correlated novelties. Nat. Sci. Rep. **2014**, 4.
- Loreto, V.; Servedio, V.D.P.; Tria, F.; Strogatz, S.H. Dynamics on expanding spaces: modeling the emergence of novelties. In Universality and Creativity in Language; Altmann, E., Esposti, M.D., Pachet, F., Eds.; Springer: Cham, Switzerland, 2016; pp. 59–83.
- Pitman, J. Combinatorial stochastic processes. In Lecture Notes in Mathematics; Springer-Verlag: Berlin, Germany, 2006; Volume 1875, p. x+256.
- Buntine, W.; Hutter, M. A Bayesian View of the Poisson-Dirichlet Process. arXiv, 2010; arXiv:1007.0296.
- De Blasi, P.; Favaro, S.; Lijoi, A.; Mena, R.H.; Pruenster, I.; Ruggiero, M. Are Gibbs-Type Priors the Most Natural Generalization of the Dirichlet Process? IEEE Trans. Pattern Anal. Mach. Intel. **2015**, 37, 212–229.
- Hart, M. Project Gutenberg. 1971. Available online: http://www.gutenberg.org/.
- Petruszewycz, M. L’histoire de la loi d’Estoup-Zipf: Documents. Math. Sci. Hum. **1973**, 44, 41–56.
- Li, W. Zipf’s Law everywhere. Glottometrics **2002**, 5, 14–21.
- Newman, M.E.J. Power laws, Pareto distributions and Zipf’s law. Contemp. Phys. **2005**, 46, 323–351.
- Piantadosi, S.T. Zipf’s word frequency law in natural language: A critical review and future directions. Psychon. Bull. Rev. **2014**, 21, 1112–1130.
- Baeza-Yates, R.; Navarro, G. Block addressing indices for approximate text retrieval. J. Am. Soc. Inf. Sci. **2000**, 51, 69–82.
- Baayen, R. Word Frequency Distributions; Number v. 1 in Text, Speech and Language Technology; Springer: Dordrecht, The Netherlands, 2001.
- Egghe, L. Untangling Herdan’s law and Heaps’ Law: Mathematical and informetric arguments. J. Am. Soc. Inf. Sci. Technol. **2007**, 58, 702–709.
- Serrano, M.A.; Flammini, A.; Menczer, F. Modeling statistical properties of written text. PLoS ONE **2009**, 4, e5372.
- Lü, L.; Zhang, Z.K.; Zhou, T. Zipf’s law leads to Heaps’ law: Analyzing their relation in finite-size systems. PLoS ONE **2010**, 5, e14139.
- Cristelli, M.; Batty, M.; Pietronero, L. There is More than a Power Law in Zipf. Sci. Rep. **2012**, 2, 812.
- Gerlach, M.; Altmann, E.G. Scaling laws and fluctuations in the statistics of word frequencies. New J. Phys. **2014**, 16, 113010.
- Johnson, S. Where Good Ideas Come From: The Natural History of Innovation; Riverhead Hardcover: New York, NY, USA, 2010.
- Wagner, A.; Rosen, W. Spaces of the possible: universal Darwinism and the wall between technological and biological innovation. J. R. Soc. Interface **2014**, 11, 20131190.
- Gouet, R. Strong Convergence of Proportions in a Multicolor Pólya Urn. J. Appl. Probab. **1997**, 34, 426–435.
- Tria, F. The dynamics of innovation through the expansion in the adjacent possible. Nuovo Cim. C Geophys. Space Phys. C **2016**, 39, 280.
- Pitman, J. Exchangeable and partially exchangeable random partitions. Probab. Theory Relat. Fields **1995**, 102, 145–158.
- De Finetti, B. La Prévision: Ses Lois Logiques, Ses Sources Subjectives. Annales de l’Institut Henri Poincaré **1937**, 17, 1–68.
- Zabell, S. Predicting the unpredictable. Synthese **1992**, 90, 205–232.
- Monechi, B.; Ruiz-Serrano, A.; Tria, F.; Loreto, V. Waves of Novelties in the Expansion into the Adjacent Possible. PLoS ONE **2017**, 12, e0179303.
- Iacopini, I.; Milojević, S.C.V.; Latora, V. Network Dynamics of Innovation Processes. Phys. Rev. Lett. **2018**, 120, 048301.
- Hoppe, F.M. Pólya-like urns and the Ewens’ sampling formula. J. Math. Biol. **1984**, 20, 91–94.
- Ewens, W. The Sampling Theory of Selectively Neutral Alleles. Theor. Popul. Biol. **1972**, 3, 87–112.
- Fisher, R.A. The Genetical Theory of Natural Selection; Clarendon Press: Oxford, UK, 1930.
- Wright, S. Evolution in Mendelian populations. Genetics **1931**, 16, 97.
- Simon, H. On a class of skew distribution functions. Biometrika **1955**, 42, 425–440.
- Zanette, D.; Montemurro, M. Dynamics of Text Generation with Realistic Zipf’s Distribution. J. Quant. Linguist. **2005**, 12, 29.
- Yamato, H.; Shibuya, M. Moments of some statistics of pitman sampling formula. Bull. Inf. Cybern. **2000**, 32, 1–10.

**Figure 1.** Zipf’s law computed on the Gutenberg corpus [15]. In this case, the exponent of the asymptotic behaviour is $\alpha \simeq 2.25$. Similar behaviours are observed in many other systems.

**Figure 2.** Growth of the number of distinct words computed on the Gutenberg corpus of texts [15]. The position of texts in the corpus is chosen at random. In this case $\gamma \simeq 0.44$. Similar behaviours are observed in many other systems.

**Figure 3.** Urn model with triggering. (**A**) An element that had previously been drawn from the urn is drawn again: the element is added to $\mathcal{S}$ and it is put back in the urn along with $\rho $ additional copies of it. (**B**) An element that never appeared in the sequence is drawn: the element is added to $\mathcal{S}$, put back in the urn along with $\rho $ additional copies of it, and $\nu +1$ brand new and distinct balls are also added to the urn.

**Figure 4.** Heaps’ law (**left**) and Zipf’s law (**right**) in the urn model with triggering. Straight lines in the Heaps’ law plots show functions of the form $f\left(x\right)=a{x}^{\gamma}$ with the exponent $\gamma =\nu /\rho $ as predicted by the analytic results and confirmed in the numerical simulations. Straight lines in the Zipf’s law plots show functions of the form $f\left(x\right)=a{x}^{-\alpha}$, with $\alpha ={\gamma}^{-1}=\rho /\nu $.
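
The exponent $\gamma =\nu /\rho $ quoted in the caption of Figure 4 can be motivated by a short rate-equation argument. What follows is a heuristic sketch read off the urn rules of Figure 3, not a substitute for the derivation of Appendix A; it assumes that the urn starts with $N_0$ balls of distinct colors (notation introduced here for illustration) and treats $D$ and $t$ as continuous. After $t$ draws with $D$ distinct colors seen, the urn holds $N_0+\rho t+(\nu +1)D$ balls, of which $N_0+\nu D$ belong to colors never drawn (each novelty uses up one unseen color and adds $\nu +1$ new ones). The probability that the next draw is a novelty then gives

$$\frac{dD}{dt}\simeq \frac{N_0+\nu D}{N_0+\rho t+(\nu +1)D}.$$

For $\rho >\nu $, assuming $D\ll t$ at large $t$, the equation reduces to

$$\frac{dD}{dt}\simeq \frac{\nu }{\rho }\,\frac{D}{t}\quad \Longrightarrow \quad D(t)\sim t^{\nu /\rho },$$

which is self-consistent because the exponent is smaller than one. For $\nu >\rho $, the linear ansatz $D\simeq ct$ gives

$$c=\frac{\nu c}{(\nu +1)c+\rho }\quad \Longrightarrow \quad c=\frac{\nu -\rho }{\nu +1},$$

i.e., a linear growth of the dictionary.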

**Figure 5.** Results for the multicolors Pólya urn model without innovation. Results are reported both from simulations of the process (points) and from the analytical predictions (straight lines), for different values of the initial number of balls ${N}_{0}$ and of the reinforcement parameter $\rho $. **Left**: Number of different colors $D\left(n\right)$ added in the sequence as a function of the total number t of extracted balls. The curves from analytical predictions of Equation (23) exactly overlap the simulated points. **Right**: Frequency-rank distribution. Simulations of the process are reported along with: (i) the prediction obtained by inverting the relation that gives the rank $R$ in terms of ${N}_{0}$ and $\mathsf{\Gamma}\left(1/\rho \right)$; (ii) the asymptotic solution for $f\left(R\right)$, valid for $R\gg 1$, obtained by inverting Equation (23) (refer to [30] for their derivation).

**Figure 6.** Taylor’s law in various generative models. (**Left panel**) Models that do not display a square root dependence of the dictionary standard deviation versus the dictionary itself are shown in color, the others in gray. Curves are listed from top to bottom according to their visual ordering. The Pólya’s urn model with triggering (PUT) shows an exponent one when $\nu <\rho $ and exponents in the range from 1/2 to ca. 0.87 when $\nu \ge \rho $. The Poisson-Dirichlet (PD) process also displays a unity exponent. (**Right panel**) Models with a square root dependence of the dictionary standard deviation versus the dictionary itself are shown in color, the rest (highlighted in the left panel) in gray. The models are Zanette-Montemurro (ZM), Random Sampling (RS), Yule-Simon Model (YSM) and the Dirichlet process (Dir). All these four as well as the PUT with parameters $\nu =1$ and $\rho =2$, the PD with $\alpha =1/2$ and $\theta =1$ produce the same Heaps’ law with exponent $\gamma =1/2$. Each curve is the result of 100 runs of ${10}^{6}$ steps each. The dashed lines with exponents 1/2 and 1 are shown as a guide for the eye.
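
The kind of measurement summarized in Figure 6 (standard deviation of the dictionary size versus its mean over an ensemble of runs) can be sketched as follows. This is a minimal illustration, not the procedure used to produce the figure: the helper names `put_dictionary` and `taylor_exponent`, the single initial ball, the checkpoint times and the small ensemble size are all assumptions chosen to keep the example fast, so the fitted exponent is only a rough estimate.

```python
import math
import random
import statistics


def put_dictionary(checkpoints, rho, nu, seed):
    """Run one urn with triggering; return D(t) at the given increasing times."""
    rng = random.Random(seed)
    urn, seen, next_color = [0], set(), 1   # assumption: a single initial ball
    out, wanted = [], set(checkpoints)
    for t in range(1, max(checkpoints) + 1):
        ball = rng.choice(urn)
        urn.extend([ball] * rho)            # rho additional copies of the drawn ball
        if ball not in seen:                # first appearance: add nu + 1 new colors
            seen.add(ball)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
        if t in wanted:
            out.append(len(seen))
    return out


def taylor_exponent(runs):
    """Slope of log(std of D) versus log(mean of D) across an ensemble of runs."""
    columns = list(zip(*runs))              # one column per checkpoint
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in stds]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den, means, stds


if __name__ == "__main__":
    times = [100, 200, 400, 800, 1600, 3200]
    runs = [put_dictionary(times, rho=2, nu=1, seed=s) for s in range(50)]
    slope, means, stds = taylor_exponent(runs)
    print("estimated Taylor exponent:", round(slope, 2))
```

The parameters $\nu =1$ and $\rho =2$ match the PUT curve mentioned in the caption. Far longer runs and larger ensembles, as in the caption, would be needed for a reliable estimate.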

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Tria, F.; Loreto, V.; Servedio, V.D.P.
Zipf’s, Heaps’ and Taylor’s Laws are Determined by the Expansion into the Adjacent Possible. *Entropy* **2018**, *20*, 752.
https://doi.org/10.3390/e20100752
