# Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data

## Abstract

## 1. Introduction

## 2. Statistical Terms

#### 2.1. Projectivity

#### 2.2. Exchangeability

## 3. Characterization of Relevant Sparse Network Models from the Statistical Perspective

#### 3.1. Barabási–Albert Model

#### 3.2. Uncorrelated Network Ensembles

## 4. Impasse with Sparsity

- (1) the average degree cannot be constant: it must diverge with $N$ (though possibly slower than linearly);
- (2) exchangeability is completely redefined: it holds not with respect to the node labels $1,\dots ,N$, but with respect to artificial labels that are positive real numbers.
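
As an illustration of condition (1), the requirement can be written as a limit. The two growth laws shown are examples chosen here for concreteness (the exponent $a$ is a hypothetical parameter, not taken from the text); both diverge, but slower than linearly:

$$
\lim_{N\to \infty }\langle k\rangle =\infty ,\qquad \text{e.g.}\quad \langle k\rangle \propto \ln N\quad \text{or}\quad \langle k\rangle \propto {N}^{a},\ 0<a<1,\quad \text{so that}\quad \frac{\langle k\rangle }{N}\to 0.
$$

By contrast, a constant average degree, $\langle k\rangle =O(1)$, is exactly what condition (1) rules out.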

#### Proposed Solution of the Impasse Based on Network Geometry

## 5. Statistical Mechanics Model with Hidden Variables

#### 5.1. The Model

#### 5.2. The Strength of a Node and Its Dependence on the Hidden Variable $\theta $

#### 5.3. Strength Distribution

#### 5.4. Connection Probability

#### 5.5. Degree Distribution in the Sparse Regime

#### 5.6. Random Permutation of the Node Sequence

#### 5.7. Entropy of the Network Model

## 6. Statistical Testing of the Model

## 7. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References and Notes

1. Barabási, A.L. Network Science; Cambridge University Press: Cambridge, UK, 2016.
2. Newman, M. Networks: An Introduction; Oxford University Press: Oxford, UK, 2010.
3. Estrada, E. The Structure of Complex Networks: Theory and Applications; Oxford University Press: Oxford, UK, 2012.
4. Latora, V.; Nicosia, V.; Russo, G. Complex Networks: Principles, Methods and Applications; Cambridge University Press: Cambridge, UK, 2017.
5. Barabási, A.L.; Albert, R. Emergence of scaling in random networks. Science **1999**, 286, 509–512.
6. Bianconi, G.; Barabási, A.L. Competition and multiscaling in evolving networks. EPL (Europhys. Lett.) **2001**, 54, 436.
7. Dorogovtsev, S.N.; Mendes, J.F.F.; Samukhin, A.N. Structure of growing networks with preferential linking. Phys. Rev. Lett. **2000**, 85, 4633.
8. Krapivsky, P.L.; Redner, S.; Leyvraz, F. Connectivity of growing random networks. Phys. Rev. Lett. **2000**, 85, 4629.
9. Bianconi, G.; Darst, R.K.; Iacovacci, J.; Fortunato, S. Triadic closure as a basic generating mechanism of communities in complex networks. Phys. Rev. E **2014**, 90, 042806.
10. Krapivsky, P.; Redner, S. Emergent network modularity. J. Stat. Mech. Theory Exp. **2017**, 2017, 073405.
11. Wu, Z.; Menichetti, G.; Rahmede, C.; Bianconi, G. Emergent complex network geometry. Sci. Rep. **2015**, 5, 10073.
12. Bianconi, G.; Rahmede, C. Network geometry with flavor: From complexity to quantum geometry. Phys. Rev. E **2016**, 93, 032315.
13. Bianconi, G.; Rahmede, C. Emergent hyperbolic network geometry. Sci. Rep. **2017**, 7, 41974.
14. Bianconi, G. The entropy of randomized network ensembles. EPL (Europhys. Lett.) **2007**, 81, 28005.
15. Bianconi, G. Entropy of network ensembles. Phys. Rev. E **2009**, 79, 036114.
16. Anand, K.; Bianconi, G. Entropy measures for networks: Toward an information theory of complex topologies. Phys. Rev. E **2009**, 80, 045102.
17. Anand, K.; Bianconi, G. Gibbs entropy of network ensembles by cavity methods. Phys. Rev. E **2010**, 82, 011116.
18. Sagarra, O.; Vicente, C.P.; Díaz-Guilera, A. Statistical mechanics of multiedge networks. Phys. Rev. E **2013**, 88, 062806.
19. Squartini, T.; de Mol, J.; den Hollander, F.; Garlaschelli, D. Breaking of ensemble equivalence in networks. Phys. Rev. Lett. **2015**, 115, 268701.
20. Snijders, T.A.; Pattison, P.E.; Robins, G.L.; Handcock, M.S. New specifications for exponential random graph models. Sociol. Methodol. **2006**, 36, 99–153.
21. Park, J.; Newman, M.E. Statistical mechanics of networks. Phys. Rev. E **2004**, 70, 066117.
22. Garlaschelli, D.; Loffredo, M. Maximum likelihood: Extracting unbiased information from complex networks. Phys. Rev. E **2008**, 78, 015101.
23. Peixoto, T.P. Hierarchical block structures and high-resolution model selection in large networks. Phys. Rev. X **2014**, 4, 011047.
24. Peixoto, T.P. Entropy of stochastic blockmodel ensembles. Phys. Rev. E **2012**, 85, 056122.
25. Goldenberg, A.; Zheng, A.X.; Fienberg, S.E.; Airoldi, E.M. A survey of statistical network models. Found. Trends Mach. Learn. **2010**, 2, 129–233.
26. Kallenberg, O. Foundations of Modern Probability; Springer: New York, NY, USA, 2002.
27. Shalizi, C.R.; Rinaldo, A. Consistency under sampling of exponential random graph models. Ann. Stat. **2013**, 41, 508–535.
28. Spencer, N.; Shalizi, C.R. Projective Sparse Latent Space Network Models. arXiv, 2017; arXiv:1709.09702.
29. Aldous, D.J. Representations for partially exchangeable arrays of random variables. J. Multivar. Anal. **1981**, 11, 581–598.
30. Diaconis, P.; Janson, S. Graph Limits and Exchangeable Random Graphs. Rend. Mat. Appl. **2008**, 28, 33–61.
31. Krioukov, D.; Ostilli, M. Duality between equilibrium and growing networks. Phys. Rev. E **2013**, 88, 022808.
32. Borgs, C.; Chayes, J.T.; Cohn, H.; Zhao, Y. An $L^{p}$ theory of sparse graph convergence I: Limits, sparse random graph models, and power law distributions. arXiv, 2014; arXiv:1401.2906.
33. Caron, F.; Fox, E.B. Sparse graphs using exchangeable random measures. J. R. Stat. Soc. Ser. B (Stat. Methodol.) **2017**, 79, 1295–1366.
34. Veitch, V.; Roy, D.M. The Class of Random Graphs Arising from Exchangeable Random Measures. arXiv, 2015; arXiv:1512.03099.
35. Borgs, C.; Chayes, J.T.; Cohn, H.; Holden, N. Sparse exchangeable graphs and their limits via graphon processes. arXiv, 2016; arXiv:1601.07134.
36. Crane, H.; Dempsey, W. Edge exchangeable models for network data. arXiv, 2016; arXiv:1603.04571.
37. Cai, D.; Campbell, T.; Broderick, T. Edge-exchangeable graphs and sparsity. In Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 5–10 December 2016; pp. 4249–4257.
38. Janson, S. On Edge Exchangeable Random Graphs. J. Stat. Phys. **2017**, 6, 1–37.
39. van der Hoorn, P.; Lippner, G.; Krioukov, D. Sparse Maximum-Entropy Random Graphs with a Given Power-Law Degree Distribution. J. Stat. Phys. **2017**, 2, 1–39.
40. Bianconi, G.; Pin, P.; Marsili, M. Assessing the relevance of node features for network structure. Proc. Natl. Acad. Sci. USA **2009**, 106, 11433–11438.
41. We note here that, while in the statistics literature the term sparse network refers to a network whose average degree is sublinear in the number of nodes $N$, i.e., $\langle k\rangle =o(N)$, here we use the term sparse network to indicate networks with average degree independent of $N$, i.e., $\langle k\rangle =O(1)$. In the statistical literature these networks are also referred to as ultra-sparse.
42. Bianconi, G. Mean field solution of the Ising model on a Barabási–Albert network. Phys. Lett. A **2002**, 303, 166–168.
43. Zhao, K.; Halu, A.; Severini, S.; Bianconi, G. Entropy rate of nonequilibrium growing networks. Phys. Rev. E **2011**, 84, 066113.
44. Dorogovtsev, S.N.; Mendes, J.F. Evolution of Networks: From Biological Nets to the Internet and WWW; Oxford University Press: Oxford, UK, 2013.
45. Leskovec, J.; Kleinberg, J.; Faloutsos, C. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, Chicago, IL, USA, 21–24 August 2005; pp. 177–187.
46. Gehrke, J.; Ginsparg, P.; Kleinberg, J. Overview of the 2003 KDD Cup. ACM SIGKDD Explor. Newsl. **2003**, 5, 149–151.
47. Leskovec, J.; Lang, K.J.; Dasgupta, A.; Mahoney, M.W. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Int. Math. **2009**, 6, 29–123.
48. Albert, R.; Jeong, H.; Barabási, A.L. Internet: Diameter of the world-wide web. Nature **1999**, 401, 130.

**Figure 1.** The degree distributions $P\left(k\right)$ of the three analysed datasets are compared with the results of the model generated by using either all the nodes of the network or just a subsample of nodes of the network of size $N$. Panels (**a**–**c**) display the results for the arXiv hep-ph citation network [45,46] ($N=$ 34,546), the Berkeley-Stanford web network [47] ($N=$ 685,546), and the Notre Dame web network [48] ($N=$ 325,000), respectively.
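
For readers who want to reproduce the quantity plotted in Figure 1, the following is a minimal sketch of how an empirical degree distribution $P(k)$ can be computed from an edge list. It assumes an undirected simple graph and counts only nodes with at least one link; the function name `degree_distribution` and the toy edge list are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)}, the fraction of nodes having degree k.

    Assumption: `edges` lists each undirected link (u, v) once,
    with no self-loops or repeated links.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)  # only nodes that appear in at least one link
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}


# Toy graph (hypothetical): node 0 linked to 1, 2, 3, plus the link 1-2.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
pk = degree_distribution(toy_edges)
```

In the toy graph the degrees are 3, 2, 2 and 1, so `pk` assigns probability 0.25 to $k=1$, 0.5 to $k=2$ and 0.25 to $k=3$.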

**Figure 2.** The average degree ${k}_{nn}\left(k\right)$ of the neighbours of a node of degree $k$ in the three analysed datasets is compared with the results of the model generated by using either all the nodes of the network or just a subsample of nodes of the network of size $N$. Panels (**a**–**c**) display the results for the arXiv hep-ph citation network [45,46] ($N=$ 34,546), the Berkeley-Stanford web network [47] ($N=$ 685,546), and the Notre Dame web network [48] ($N=$ 325,000), respectively.
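
The quantity ${k}_{nn}\left(k\right)$ shown in Figure 2 can be estimated from an edge list along the following lines. This is a sketch under the assumption of an undirected simple graph without self-loops; the function name `average_neighbour_degree` and the toy edge list are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def average_neighbour_degree(edges):
    """Return {k: k_nn(k)}, the mean degree of neighbours of degree-k nodes.

    Assumption: `edges` describes an undirected graph without self-loops;
    repeated links are collapsed because neighbours are stored in sets.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbours.items()}

    total = defaultdict(float)  # summed neighbour degrees over degree-k nodes
    ends = defaultdict(int)     # number of link ends attached to degree-k nodes
    for node, nbrs in neighbours.items():
        k = degree[node]
        total[k] += sum(degree[j] for j in nbrs)
        ends[k] += k
    return {k: total[k] / ends[k] for k in sorted(total)}


# Toy graph (hypothetical): node 0 linked to 1, 2, 3, plus the link 1-2.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
knn = average_neighbour_degree(toy_edges)
```

Dividing by the number of link ends is equivalent to averaging the per-node mean neighbour degree, because every node in a given class has the same degree $k$.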

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kartun-Giles, A.P.; Krioukov, D.; Gleeson, J.P.; Moreno, Y.; Bianconi, G.
Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data. *Entropy* **2018**, *20*, 257.
https://doi.org/10.3390/e20040257

**Chicago/Turabian Style**

Kartun-Giles, Alexander P., Dmitri Krioukov, James P. Gleeson, Yamir Moreno, and Ginestra Bianconi.
2018. "Sparse Power-Law Network Model for Reliable Statistical Predictions Based on Sampled Data" *Entropy* 20, no. 4: 257.
https://doi.org/10.3390/e20040257