# Snooker Statistics and Zipf’s Law

## Abstract


## 1. Introduction

## 2. Methods

`R` language for statistical computing [15].

`R` package

`poweRlaw` [18].

`CueTracker` [19]. Finally, for a fair comparison across the different ranking types and time frames, all data sets were cut to the same length by taking the top 100 entries of each ranking.
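The truncation step described above can be sketched as follows. This is a minimal illustration in Python rather than the `R` code used for the paper; the function name `top_n` and the sample numbers are hypothetical and not taken from the actual data.

```python
def top_n(values, n=100):
    """Return the n largest values in descending order (rank 1 first)."""
    return sorted(values, reverse=True)[:n]


# Hypothetical century counts for a handful of players (not real data).
centuries = [12, 340, 5, 77, 120, 5, 901]
ranking = top_n(centuries, n=5)
print(ranking)  # [901, 340, 120, 77, 12]
```

Cutting every ranking to the same length means that each fit is based on the same number of points, so the estimated parameters can be compared directly.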

## 3. Results

`poweRlaw` package. Note that this package treats (and plots) the data as a cumulative probability distribution, rather than an explicit ranking. Figure 4 is still a log-log plot, but the values (in this case the number of centuries) are now on the horizontal axis, while their cumulative probabilities are on the vertical axis. For example, the probability that the value is equal to or larger than the maximum value in the ranking is 0.01, given that there are 100 entries in the ranking and that there is a unique maximum value. This is represented by the dot at the bottom-right of the plot. Similarly, the probability that the value is equal to or larger than the second-highest value in the ranking is 0.02 (second dot up from the bottom-right), and so on, up to the probability that the value is equal to or larger than the minimum value in the ranking, which of course is 1.0 (represented by the dot at the top-left of the plot).
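The cumulative probabilities described above can be reproduced with a short sketch. It is written in plain Python as an illustration and is not the `poweRlaw` implementation; the function name `empirical_ccdf` and the toy ranking of 100 distinct values are assumptions made for the example.

```python
from collections import Counter


def empirical_ccdf(values):
    """Return (x, P(X >= x)) pairs for the distinct values, in ascending x."""
    n = len(values)
    counts = Counter(values)
    points = []
    at_least = n  # number of entries >= current x, starting at the minimum
    for x in sorted(counts):
        points.append((x, at_least / n))
        at_least -= counts[x]
    return points


# Toy ranking with 100 distinct values (hypothetical, not the snooker data).
ranking = list(range(1, 101))
ccdf = empirical_ccdf(ranking)
print(ccdf[-1])  # (100, 0.01): the unique maximum
print(ccdf[-2])  # (99, 0.02): the second-highest value
print(ccdf[0])   # (1, 1.0): the minimum
```

Tied values share a single point in this representation, which is why the 0.01 probability for the top entry requires the maximum to be unique.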

## 4. Discussion

`CueTracker` [19] for the years considered here. The estimated value of $\alpha$ for a given year was then plotted against the total amount of prize money (in million GBP) available in that year. This plot is shown in Figure 3 (right).
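For readers who want to experiment with such estimates, one common way to obtain a power law exponent $\alpha$ and an $R^2$ value for a ranking is ordinary least squares on log-transformed rank and value. The sketch below assumes this approach, which may differ from the procedure actually used in the paper; the function name `fit_power_law` and the synthetic data are hypothetical.

```python
import math


def fit_power_law(values):
    """Fit value ~ C * rank**(-alpha) by least squares on log-log axes.

    `values` must be positive and sorted in descending order, so that
    values[0] has rank 1. Returns (alpha, C, r_squared).
    """
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared correlation in log space
    return -slope, math.exp(intercept), r_squared


# Synthetic ranking that follows an exact power law (not the snooker data).
synthetic = [1000.0 * rank ** -0.9 for rank in range(1, 101)]
alpha, scale, r2 = fit_power_law(synthetic)
print(alpha, scale, r2)  # close to 0.9, 1000 and 1 for this exact power law
```

Because the fit is done in log space, it weights relative rather than absolute deviations, and the reported $R^2$ describes the straightness of the log-log plot rather than the fit on the original scale.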

## 5. Conclusions

## Funding

## Data Availability Statement

`CueTracker` (https://cuetracker.net, accessed on 21 October 2022).

## Acknowledgments

## Conflicts of Interest

## References

1. Zipf, G.K. The Psychobiology of Language; Houghton-Mifflin: New York, NY, USA, 1935.
2. Zipf, G.K. Human Behavior and the Principle of Least Effort; Addison-Wesley: New York, NY, USA, 1949.
3. Piantadosi, S.T. Zipf’s word frequency law in natural language: A critical review and future directions. Psychon. Bull. Rev. **2014**, 21, 1112–1130.
4. Gibrat, R. Les Inégalités Économiques; Sirey: Paris, France, 1931.
5. Newman, M.E.J. Power laws, Pareto distributions and Zipf’s law. Contemp. Phys. **2005**, 46, 323–351.
6. Cazzolla Gatti, R.; Fath, B.; Hordijk, W.; Kauffman, S.A.; Ulanowicz, R. Niche emergence as an autocatalytic process in the evolution of ecosystems. J. Theor. Biol. **2018**, 454, 110–117.
7. Steel, M.; Hordijk, W.; Kauffman, S.A. Dynamics of a birth-death process based on combinatorial innovation. J. Theor. Biol. **2020**, 491, 110187.
8. Corominas Murtra, B.; Solé, R.V. Universality of Zipf’s law. Phys. Rev. E **2010**, 82, 011102.
9. Deng, W.; Li, W.; Cai, X.; Bulou, A.; Wang, Q.A. Universal scaling in sports ranking. New J. Phys. **2012**, 14, 093038.
10. Everton, C. The History of Snooker and Billiards; The Book Service: Colchester, UK, 1986.
11. Morales, J.A.; Sánchez, S.; Flores, J.; Pineda, C.; Gershenson, C.; Cocho, G.; Zizumbo, J.; Rodriguez, R.F.; Iñiguez, G. Generic temporal features of performance rankings in sports and games. EPJ Data Sci. **2016**, 5, 33.
12. Morales, J.A.; Flores, J.; Gershenson, C.; Pineda, C. Statistical properties of rankings in sports and games. Adv. Complex Syst. **2021**, 24, 2150007.
13. Hordijk, W. The power of snooker. Plus Magazine, 2019. Available online: https://plus.maths.org/content/power-snooker (accessed on 19 October 2022).
14. Davies, M. The Corpus of Contemporary American English (COCA). 2022. Available online: https://www.english-corpora.org/coca (accessed on 19 October 2022).
15. R Core Team. R: A Language and Environment for Statistical Computing. 2021. Available online: https://www.R-project.org (accessed on 19 October 2022).
16. Samuels, J.M. Size and the growth of firms. Rev. Econ. Stud. **1965**, 32, 105–112.
17. Clauset, A.; Shalizi, C.R.; Newman, M.E.J. Power-law distributions in empirical data. SIAM Rev. **2009**, 51, 661–703.
18. Gillespie, C. poweRlaw: Analysis of Heavy Tailed Distributions. 2020. Available online: https://cran.r-project.org/web/packages/poweRlaw (accessed on 19 October 2022).
19. Florax, R. CueTracker. 2022. Available online: https://cuetracker.net (accessed on 19 October 2022).
20. Dębowski, L. Local grammar-based coding revisited. arXiv **2022**, arXiv:2209.13636.
21. Axtell, R.L. Zipf distribution of U.S. firm sizes. Science **2001**, 293, 1818–1820.
22. O’Brien, J.D.; Gleeson, J.P. A complex network approach to ranking professional snooker players. J. Complex Netw. **2021**, 8, cnab003.
23. Hordijk, W.; Hordijk, A.; Heidergott, B. A genetic algorithm for finding good balanced sequences in a customer assignment problem with no state information. Asia-Pac. J. Oper. Res. **2015**, 32, 1550015.

**Figure 1.** A log-log plot of word frequency against rank for the 50 most frequent words from the COCA. The solid line represents an estimated power law fit.

**Figure 2.** The snooker data for the two ranking types and three time frames. Solid lines represent estimated power laws.

**Figure 3.** **Left**: The change in estimated power law parameter over time. **Right**: The estimated parameter values for the prize money ranking in a given year, against the total amount of prize money available that year (open circles). The solid line represents a power law fit to the data.

**Figure 4.** Comparison of a power law fit (solid line) with a log-normal fit (dashed line) for the all-time centuries ranking represented as a cumulative probability distribution (open circles).

**Table 1.** Estimated parameter $\alpha$ and goodness-of-fit $R^2$ for the power laws fitted to the snooker statistics.

| Time frame | Prize Money $\alpha$ | Prize Money $R^2$ | Centuries $\alpha$ | Centuries $R^2$ |
|---|---|---|---|---|
| All-time | 0.857 | 0.96 | 0.741 | 0.94 |
| Decade | 0.984 | 0.95 | 0.837 | 0.92 |
| Year | 0.978 | 0.96 | 0.863 | 0.94 |

**Table 2.** Estimated parameters $\mu$ and $\sigma$ for the log-normal distributions fitted to the snooker statistics.

| Time frame | Prize Money $\mu$ | Prize Money $\sigma$ | Centuries $\mu$ | Centuries $\sigma$ |
|---|---|---|---|---|
| All-time | 12.75 | 1.42 | 4.52 | 1.07 |
| Decade | 12.35 | 1.36 | 4.00 | 1.10 |
| Year | 9.33 | 1.64 | 1.76 | 1.10 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Hordijk, W. Snooker Statistics and Zipf’s Law. *Stats* **2022**, *5*, 985–992. https://doi.org/10.3390/stats5040058

**AMA Style**

Hordijk W. Snooker Statistics and Zipf’s Law. *Stats*. 2022; 5(4):985–992. https://doi.org/10.3390/stats5040058

**Chicago/Turabian Style**

Hordijk, Wim. 2022. "Snooker Statistics and Zipf’s Law" *Stats* 5, no. 4: 985–992. https://doi.org/10.3390/stats5040058