Low-temperature behaviour of social and economic networks

Real-world social and economic networks typically display a number of particular topological properties, such as a giant connected component, a broad degree distribution, the small-world property and the presence of communities of densely interconnected nodes. Several models, including ensembles of networks also known in social science as Exponential Random Graphs, have been proposed with the aim of reproducing each of these properties in isolation. Here we define a generalized ensemble of graphs by introducing the concept of graph temperature, controlling the degree of topological optimization of a network. We consider the temperature-dependent version of both existing and novel models and show that all the aforementioned topological properties can be simultaneously understood as the natural outcomes of an optimized, low-temperature topology. We also show that seemingly different graph models, as well as techniques used to extract information from real networks, are all particular low-temperature cases of the same generalized formalism. One such technique allows us to extend our approach to real weighted networks. Our results suggest that a low graph temperature might be a ubiquitous property of real socio-economic networks, placing conditions on the diffusion of information across these systems.

the link density of the network and the probability of connections [9]. However, in order to have a global parameter coupled not only to the number of links, but also to any other topological property of the network, we also introduce the graph temperature $T$. We therefore define a generalized ensemble where the probability of graph $A$ is given by

$$P_A = \frac{e^{(\mu L_A - E_A)/T}}{Z} \tag{1}$$

where $E_A$ is the energy of the particular graph $A$ (a function of one or more topological properties of $A$, to be specified in each particular model) and

$$Z = \sum_A e^{(\mu L_A - E_A)/T} \tag{2}$$

is the grand partition function of the ensemble. Note that when $T \to \infty$ we have $P_A = 2^{-N(N-1)/2}$ for all graphs, while when $T \to 0$ we have $P_A = 1$ for the graph with the maximum value of $\mu L_A - E_A$ (or $P_A = M^{-1}$ if there are $M$ degenerate such graphs), and $P_A = 0$ for all other graphs.

The temperature in eq.(1) might appear to be redundant, since the parameter $T$ can in principle be reabsorbed in a redefinition of $E_A$ and $\mu$ without loss of generality. In other words, all choices of parameters that lead to the same values of $E_A/T$ and $\mu/T$ will generate indistinguishable results, meaning that the value of $T$ is indeterminate. While this is mathematically true, there is a definite 'physical' benefit in including the temperature as an additional parameter. As we discuss below, the benefit is that of incorporating in $T$ all the 'collective effects' arising in large networks, while leaving [...]

In general, being a combination of topological properties, the energy $E_A$ can be an arbitrarily complicated function of the adjacency matrix $A$, but throughout the present paper we consider the simple and instructive case, explored in many models, where it can be written as a sum over the individual link energies $\epsilon_{ij}$ [6,9,12]:

$$E_A = \sum_{i<j} \epsilon_{ij}\, a_{ij} \tag{3}$$

where $a_{ij}$ denotes the entries of the adjacency matrix. As we show below, this choice can, despite its simplicity, give rise not only to random graphs, but also to complex scale-free networks, small-worlds, networks with correlations, clustering, and community structure. The partition function reads

$$Z = \prod_{i<j} \left[ 1 + e^{(\mu - \epsilon_{ij})/T} \right] \tag{4}$$

and the graph probability is

$$P_A = \prod_{i<j} p_{ij}(T)^{a_{ij}} \left[ 1 - p_{ij}(T) \right]^{1 - a_{ij}} \tag{5}$$

where

$$p_{ij}(T) = \frac{1}{e^{(\epsilon_{ij} - \mu)/T} + 1} \tag{6}$$

is the probability that a link between $i$ and $j$ exists. Equation (6) has the usual form of Fermi statistics (alternative derivations of the above expression for $p_{ij}$ are given in refs. [6,9,20] for $T = 1$). Therefore the additivity of $E_A$ implies that each link is drawn independently with probability $p_{ij}$.
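As an illustration of how eq.(6) turns link energies into a random graph, the following minimal Python sketch draws each link independently with the Fermi-like probability. The function names (`link_probability`, `sample_graph`), the nested-list representation of the energies and the overflow guard are assumptions made here for illustration and are not part of the original formalism.

```python
import math
import random


def link_probability(eps_ij, mu, T):
    """Fermi-like probability of eq.(6): 1 / (exp((eps_ij - mu) / T) + 1), for T > 0."""
    arg = (eps_ij - mu) / T
    if arg > 700.0:
        # exp() would overflow; the probability is numerically zero here.
        return 0.0
    return 1.0 / (math.exp(arg) + 1.0)


def sample_graph(eps, mu, T, rng):
    """Draw one undirected graph from the ensemble.

    eps is a symmetric N x N nested list of link energies (hypothetical input
    format); the result is a symmetric 0/1 adjacency matrix.
    """
    n = len(eps)
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Additivity of the energy means each pair is an independent trial.
            if rng.random() < link_probability(eps[i][j], mu, T):
                a[i][j] = a[j][i] = 1
    return a


if __name__ == "__main__":
    rng = random.Random(0)
    energies = [[0.0, 0.2, 5.0], [0.2, 0.0, 5.0], [5.0, 5.0, 0.0]]
    print(sample_graph(energies, mu=1.0, T=0.5, rng=rng))
```

Because the pairs are independent, sampling costs one uniform random number per pair of vertices; no Markov-chain sampling is required under the additive choice of eq.(3).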

If the form of $\epsilon_{ij}$ is further simplified, many important network models are obtained as particular cases of eq.(6), including hidden-variable models, the configuration model and random graphs [6]. We shall introduce the temperature-dependent version of these models in what follows. We shall also exploit eq.(6) to study a temperature-dependent small-world model, a model with community structure, and ensembles of binary graphs derived from real-world weighted networks. Therefore eq.(6) gives rise to a rich phenomenology and will be of central importance throughout the paper.

Before considering particular cases, let us first note some general properties of eq.(6). Note that, independently of $T$, $p_{ij} > 1/2$ when $\epsilon_{ij} < \mu$ and $p_{ij} < 1/2$ when $\epsilon_{ij} > \mu$. It is interesting to consider the infinite- and zero-temperature limits, as well as the 'classical' one.

When $T \to +\infty$, eq.(6) implies that

$$\lim_{T \to +\infty} p_{ij}(T) = \frac{1}{2} \tag{7}$$

irrespective of the values of $\epsilon_{ij}$ and hence of the differences in the cost of links. As a consequence, the network is a random graph with $p = 1/2$ and is therefore trivial. Note that in this case any two configurations $A$ and $B$ become equiprobable ($P_A = P_B$).

When $T = 0$, we instead have

$$p_{ij}(0) = \Theta(\mu - \epsilon_{ij}) \tag{8}$$

where $\Theta(x) = 1$ if $x > 0$ and $\Theta(x) = 0$ if $x < 0$. Technically, we should define $\Theta(0) = 1/2$ in order to capture the correct behaviour of eq.(6), even if we will not encounter this situation in what follows. The above equation means that only those pairs of vertices for which $\epsilon_{ij} < \mu$ are connected. This is analogous to the well-known degenerate behaviour of Fermions at zero temperature, and $\mu$ is also termed the Fermi energy $\epsilon_F = \mu$. This clarifies the role of $\mu$ as the available energy per link when $T \to 0$: at absolute zero, only the topology with the minimum value of $E_A - \mu L_A$ can be realized. This topology is obtained by drawing all and only the links with $\epsilon_{ij} < \mu$.
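The two limits of eq.(6) can be checked numerically. The sketch below is only illustrative: the function name `fermi` and the explicit handling of $T = 0$ as a step function with $\Theta(0) = 1/2$ are choices made here, following the convention stated in the text.

```python
import math


def fermi(eps, mu, T):
    """Link probability of eq.(6), extended to T = 0 via the step function."""
    if T == 0:
        # Zero-temperature limit: Theta(mu - eps), with Theta(0) = 1/2.
        if eps < mu:
            return 1.0
        if eps > mu:
            return 0.0
        return 0.5
    arg = (eps - mu) / T
    if arg > 700.0:
        # Avoid overflow in exp(); the probability is numerically zero.
        return 0.0
    return 1.0 / (math.exp(arg) + 1.0)


if __name__ == "__main__":
    mu = 1.0
    for T in (0, 0.01, 0.1, 1.0, 10.0, 1e6):
        # A cheap link (eps < mu) and an expensive one (eps > mu).
        print(T, fermi(0.5, mu, T), fermi(2.0, mu, T))
```

As $T$ grows, both columns approach 1/2; as $T$ shrinks, the cheap link becomes certain and the expensive one impossible, which is the degenerate behaviour described above.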

A final general comment is that eq.(6) reduces to the 'classical limit' [6]

$$p_{ij}(T) \approx e^{(\mu - \epsilon_{ij})/T} \tag{9}$$

when $e^{(\epsilon_{ij} - \mu)/T} \gg 1$. We will consider the above limit in some applications later on.

[...] well-defined in the large $N$ limit), and the collective (network-wide) effects are reabsorbed in $T$. As we have anticipated above, this is the main added value of isolating $T$ from the other parameters of the model, and the ultimate reason why we believe that investigating the temperature dependence of network ensembles is important. As a final remark, we require $\epsilon_{ij}$, $\mu$ and $T$ to be dimensionless. If we imagine that $\epsilon_{ij}$ is (a function of) an empirically measurable quantity such as distance or money, a dimensionless specification can be achieved by assuming that both $\epsilon_{ij}$ and $\mu$ have been preliminarily divided by some appropriately averaged (either over vertices or vertex pairs) value of $\epsilon_{ij}$, and by simply considering $T$ as a dimensionless parameter. We will discuss this point in each of the following examples.

In what follows we consider various specific cases. The simplest scenario is when all link energies are equal: $\epsilon_{ij} = \epsilon$. This yields a temperature-dependent random graph of the Erdős-Rényi type, since all probabilities $p_{ij}$ are equal to

$$p(T) = \frac{1}{e^{(\epsilon - \mu)/T} + 1} \tag{10}$$

Note that, if we assume that $\epsilon$ has been divided by its average value over all pairs of vertices to make it dimensionless, we should simply set $\epsilon = 1$. When looking at the above formula, as well as the following ones, this is the value of $\epsilon$ that we should keep in mind.

While the properties of the random graph are well known, in our framework some intriguing results emerge as the temperature is varied, and in particular when $T \to 0$. First of all, we note that

$$p(0) = \Theta(\mu - \epsilon) \tag{11}$$

implying that the graph is either fully connected ($\mu > \epsilon$) or empty ($\mu < \epsilon$). This result provides us with a useful (for our purposes in what follows) definition of 'sparseness' of a network. We define a random graph as sparse (dense) if $\epsilon > \mu$ ($\epsilon < \mu$), since when $T \to 0$ the graph becomes empty (fully connected).

This means that, at finite temperature, a sparse graph (as defined above) will be such that $p(T) < 1/2$ and a dense graph will be such that $p(T) > 1/2$. [...]

Before considering other models, it is quite interesting to consider the percolation transition marking the onset of a giant connected component in an infinitely large random graph. For random graphs, it is well known that this transition occurs when the connection probability $p$ is set to the critical value $p_c \sim 1/N$, i.e. when the function $f(N)$ introduced above is $f(N) \sim N$. In our framework, since $\epsilon$ and $\mu$ are fixed, we can regard the phase transition as temperature-dependent. If $\epsilon < \mu$ then $p(T) > 1/2 > p_c$ at all temperatures, meaning that dense graphs are always above the critical threshold. If $\epsilon > \mu$, there is a critical percolation temperature $T_c$ such that $p(T_c) = p_c \sim 1/N$. Inverting, we find that for sparse graphs

$$T_c \sim \frac{\epsilon - \mu}{\ln N}$$

We note that the link density of most real-world socio-economic networks is (significantly) smaller than 1/2. This means that, when modeled as random graphs (i.e. when considering a connection probability $p$ equal to the observed link density), real networks systematically fall in the 'sparse graph' category and are therefore such that $\epsilon > \mu$. It should also be noted that in most cases the observed density decays as $1/f(N)$, where $f(N)$ is an increasing function of $N$. This means that, in order to reproduce the empirically observed density, random graphs should be such that

$$p(T) \sim \frac{1}{f(N)}$$

which implies

$$T \sim \frac{\epsilon - \mu}{\ln f(N)}$$

This result shows that larger graphs have a smaller temperature, providing a first indication of the fact that large real-world networks might be generally characterized by a small value of the graph temperature.
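The inversion of the random-graph probability can be made concrete with a short numerical sketch. It assumes the sparse case $\epsilon > \mu$ and inverts $p(T) = 1/(e^{(\epsilon - \mu)/T} + 1)$ exactly, which gives $T = (\epsilon - \mu)/\ln(1/p - 1)$ and reduces to the scaling forms above for large $N$. The function names and the default values $\epsilon = 1$, $\mu = 0.5$ are hypothetical choices for illustration.

```python
import math


def temperature_for_density(p_target, eps=1.0, mu=0.5):
    """Temperature at which a sparse random graph (eps > mu) has density p_target.

    Exact inversion of p(T) = 1 / (exp((eps - mu) / T) + 1).
    """
    if eps <= mu:
        raise ValueError("the sparse case requires eps > mu")
    if not 0.0 < p_target < 0.5:
        raise ValueError("a sparse graph has 0 < p < 1/2 at any finite T")
    return (eps - mu) / math.log(1.0 / p_target - 1.0)


def percolation_temperature(n, eps=1.0, mu=0.5):
    """Temperature at which p(T) equals the critical value p_c = 1/n (n > 2)."""
    return temperature_for_density(1.0 / n, eps, mu)


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        # The critical temperature decreases slowly (logarithmically) with n.
        print(n, percolation_temperature(n))
```

Under these assumptions the critical temperature decreases only logarithmically with the number of vertices, so large sparse graphs sit at low but non-zero temperature.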

It is also important to note that, for most observed networks, $f(N) \sim cN$ with $c < 1$. In combination with eq.(12) this means that, when modeled as random graphs, large real-world networks have a low but non-zero temperature, i.e. they are 'just above' the percolation threshold. This is enough to ensure [...]

Another case of great interest is when the link energy in eq.(3) is the sum of two single-vertex contributions:

$$\epsilon_{ij} = \epsilon_i + \epsilon_j \tag{16}$$

For future convenience, we assume that $\epsilon_i \le 0\ \forall i$ (this can always be achieved by an irrelevant overall shift in the energies, $\epsilon_i \to \epsilon_i - \epsilon_{\max} \le 0$). Moreover, to have a dimensionless quantity we imagine that $\epsilon_i$ (and similarly $\epsilon_j$) has been preliminarily divided by the absolute value of its average over all vertices.

After these operations, we therefore have $\bar{\epsilon} = -1$, where the bar denotes an average of $\epsilon_i$ over vertices.

The above choice leads to the graph energy

$$E_A = \sum_{i<j} (\epsilon_i + \epsilon_j)\, a_{ij} = \sum_i \epsilon_i k_i \tag{17}$$

where $k_i \equiv \sum_j a_{ij}$ is the degree (number of links) of vertex $i$. Note that all graphs $A$ with the same degrees have the same energy $E_A$ and are therefore equiprobable. This case therefore represents the grand-canonical version of the so-called Configuration Model, i.e. a model of random networks with given degrees [6]. It can also be regarded as a particular case of the class of Fitness Models [17] where each node $i$ is characterized by a 'fitness' or 'hidden variable' $x_i$ determining the connection probability. The novelty of our approach is that the node fitness $x_i \equiv e^{-\epsilon_i/T}$ and the 'fugacity' $z \equiv e^{\mu/T}$ (in terms of which the model is conveniently described [20,22]) now depend on $T$. We can therefore write

$$p_{ij}(T) = \frac{z\, x_i x_j}{1 + z\, x_i x_j} \tag{18}$$

[...]

To highlight the role of $T$, we now rephrase the above results in terms of the energies $\epsilon_i$. For convenience we introduce the non-negative quantity $\phi_i \equiv -\epsilon_i \ge 0$, which measures the tendency of vertex $i$ to form connections [17]. Similarly, we define $\phi_0 \equiv -\mu$. Now, if we want $x$ to be distributed according to

$$\rho(x) = (\gamma - 1)\, x^{-\gamma} \tag{19}$$

(where $1 \le x < +\infty$ and $\gamma > 1$), then the quantity $\phi_i = -\epsilon_i = T \ln x_i$ must be distributed according to

$$q(\phi) = \frac{\gamma - 1}{T}\, e^{-(\gamma - 1)\phi/T} \tag{20}$$

Now, since $\phi$ does not depend on $T$, $q(\phi)$ must be $T$-independent as well. The only possibility is therefore

$$\gamma - 1 = \lambda T, \qquad q(\phi) = \lambda\, e^{-\lambda \phi} \tag{21}$$

with $\lambda$ independent of $T$. On the other hand, since $\phi_i = -\epsilon_i$ and $\bar{\epsilon} = -1$, we also have $\bar{\phi} = -\bar{\epsilon} = 1$. This means that we must set $\lambda = 1$. This yields $\gamma = 1 + T$ and

$$\rho(x) = T\, x^{-1-T} \tag{22}$$

which is an important result showing how $T$ determines $\rho(x)$ and consequently the topology of the network.
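The relation $\gamma = 1 + T$ can be checked with a small simulation. The sketch below draws $\phi$ from $q(\phi) = e^{-\phi}$, maps it to the fitness $x = e^{\phi/T}$, and estimates the power-law exponent of $x$ with the standard maximum-likelihood formula for a Pareto variable with $x \ge 1$. The function names and the choice of estimator are assumptions made here for illustration; very small $T$ is avoided because $e^{\phi/T}$ would overflow.

```python
import math
import random


def sample_fitness(n, T, rng):
    """Draw n fitness values x = exp(phi / T) with phi distributed as exp(-phi)."""
    phis = [rng.expovariate(1.0) for _ in range(n)]
    return [math.exp(phi / T) for phi in phis]


def estimate_gamma(xs):
    """Maximum-likelihood exponent of rho(x) ~ x^(-gamma) for samples with x >= 1."""
    return 1.0 + len(xs) / sum(math.log(x) for x in xs)


if __name__ == "__main__":
    rng = random.Random(42)
    for T in (0.5, 1.0, 1.5, 2.0):
        xs = sample_fitness(20000, T, rng)
        # The estimate should be close to the predicted value 1 + T.
        print(T, estimate_gamma(xs), 1.0 + T)
```

The energies $\phi_i$ are drawn once from a temperature-independent distribution; only the map from $\phi$ to $x$ changes with $T$, which is how the temperature controls the heterogeneity of the fitness values.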

For instance, in the classical limit (9) we recover the $T$-dependent version of a model studied in ref. [17], since

$$p_{ij}(T) \approx z\, x_i x_j \tag{23}$$

In this case there are no degree correlations due to the factorization of $p_{ij}(T)$.

In the more general case (i.e. outside the 'classical' regime), $P(k)$ has a power-law region with an exponent that is still an increasing function of $T$, followed by a cut-off arising from the saturation of $p_{ij}(T)$. The power-law region narrows as $T$ increases. This qualitative behaviour can be characterized rigorously by computing $\bar{k}_i$ as a function of $x_i$ or $\phi_i$, and inverting this relation to find $P(k)$ from $\rho(x)$ or $q(\phi)$. This is not easy in general, but here we show that in the three paradigmatic cases $T = +\infty$, $T = 1$ and $T = 0$ it can be done successfully. [...]

For $T = 1$, denoting $p_{ij}(T) = p(\phi_i, \phi_j)$, the expected degree of a vertex with fitness $\phi$ can be evaluated as the integral

$$\bar{k}(\phi) = N \int_0^{\infty} p(\phi, \phi')\, q(\phi')\, d\phi'$$

which is an increasing function of $x$ and is therefore invertible. If $x(\bar{k})$ denotes the inverse function, the expected degree distribution is $P(\bar{k}) = \rho[x(\bar{k})]\, dx/d\bar{k}$. Note that $\bar{k} \propto x$ for small $x$, while $\bar{k} \to N$ for large $x$. Thus in the linear regime (small $\bar{k}$) we have $x \propto \bar{k}$ as in the classical limit, so that $dx/d\bar{k}$ is constant and

$$P(\bar{k}) \propto \bar{k}^{-2}$$

This scale-free region is followed by a cut-off for large $\bar{k}$, corresponding to the 'saturated' behaviour. Finally, when $T = 0$ the expression for $\rho(x)$ in eq.(22) breaks down since all the $x_i$'s become infinite, and from eq.(8) we find

$$p_{ij}(0) = \Theta(\phi_i + \phi_j - \phi_0)$$

Surprisingly, this coincides with another model introduced in ref. [17], which precisely assumes $q(\phi) = e^{-\phi}$ as in eq.(21) and thus turns out to be the zero-temperature limit of our general model. This model is intriguing since a derivation similar to that in eq.(24) shows that it yields a purely scale-free degree distribution (now without cut-off), even if no power-laws are introduced 'by hand' in the model [17,18]. Moreover, the model displays anticorrelation between degrees: the average nearest neighbour degree scales as $\bar{k}^{nn}(k) \propto k^{-1}$ and the clustering coefficient scales as $\bar{c}(k) \propto k^{-2}$ (times logarithmic corrections) [17,18].

We note that, while in ref. [17] the above model was proposed as an alternative way to produce scale-free networks, different from the specification leading to eq.(23), here we find that both choices are actually two particular cases of the same temperature-dependent model. We also note that eq.(27) cannot be retrieved as the zero-temperature limit of eq.(23), since in such limit the 'classical' approximation (9) is no longer valid. Rather, the above results show that as $T$ goes to zero the exponent of the degree distribution approaches $-2$, with a gradual disappearance of the upper cut-off. Moreover, we stress that while the topological properties of the network depend on both the temperature $T$ and the chemical potential $\mu$, the latter strongly determines the mean of the degree distribution (i.e. the link density) but not its functional form, which is instead mainly determined by $T$.

Taken together, the above results lead to the following intriguing conclusion: in this model, correlated scale-free networks with exponent $-2$ naturally arise as the optimized topology at zero temperature. As $T$ grows, the correlations become weaker, the exponent of $P(k)$ increases and a cut-off appears destroying the purely scale-free behaviour, until for $T \to \infty$ the network becomes an uncorrelated random graph with a Poisson degree distribution. In our framework, it is clear that $\phi_0$ plays the role of a Fermi energy. We can also interpret the correlations at $T = 0$ as the collective need to minimise the total energy, an effect that gradually weakens as $T$ increases.

We now make some important considerations about the temperature of real-world binary networks. The degree distribution of most real scale-free networks has a broad tail of the form

$$P(k) \propto k^{-\gamma}, \qquad 2 < \gamma < 3$$

The above observed range of the exponent is another remarkable indication that real networks are consistent with a low-temperature model. In particular, all binary scale-free networks in the 'classical regime' described by eq.(23) are consistent with a temperature

$$1 < T_{\text{binary}} < 2$$

Scale-free networks outside the classical regime are instead characterized by an even lower temperature, since we have shown that $\gamma = 2$ is realized at $T = 0$. For these networks, a small positive value $T_{\text{binary}} > 0$ is already enough to produce a realistic degree distribution with $\gamma > 2$.

We finally note that, if one has access to the empirical distribution $\rho(x)$, one can measure $T_{\text{binary}}$ for any real network which is well described by eq.(18), even if this network is not scale-free. This is for [...] pair of vertices has an associated quantity $\phi_{ij} \equiv -\epsilon_{ij}$ drawn from a distribution $q(\phi)$, and a probability $p_{ij} = p(\phi_{ij})$ to exist. This corresponds to the general case defined by eqs. (3) and (6).

The vanishing of the percolation threshold as shown previously in eq.(13) for the random graph example is actually a more general result, and holds even when different pairs of vertices have different values of $\epsilon_{ij}$ as in eq.(6), i.e. when $\mu < \epsilon_{\min} \le \epsilon_{ij} \le \epsilon_{\max}$. In this case, we must have [...] where [...]

In what follows we consider three particular cases of the [...]

In the simplest situation, the energy $\epsilon_{ij}$ of a link is simply proportional to the distance $d_{ij}$ between its end-point vertices. If we imagine that both $\epsilon_{ij}$ and $d_{ij}$ have been made dimensionless by dividing each of them by the respective average value over all pairs of vertices, the proportionality constant drops out and we can simply write

$$\epsilon_{ij} = d_{ij} \tag{35}$$

This implies

$$E_A = \sum_{i<j} d_{ij}\, a_{ij} \tag{36}$$

so that the probability that a link exists between $i$ and $j$ reads

$$p_{ij}(T) = \frac{1}{e^{(d_{ij} - \mu)/T} + 1} \tag{37}$$

Let us first consider the zero-temperature behaviour. The above probability becomes

$$p_{ij}(0) = \Theta(\mu - d_{ij}) \tag{38}$$

which is nothing but the definition of a local 'metric' network connecting the geometrically closest [...]

At zero temperature, eq.(38) implies that if $d < \mu < 2d$ then the network is a ring with first-neighbour interactions (as in Fig. 1), if $2d < \mu < 3d$ then the network is a ring with second-neighbour interactions, [...]

Note that, if we allow the chemical potential to take precisely an integer multiple of $d$, $\mu = md$, then the pairs of vertices separated by a distance $d_{ij} = md$ will be connected with probability $p_{ij}(0) = 1/2$, adding a sort of 'random anomaly' to the ring-like or lattice structure. For this reason, we have deliberately restricted $\mu$ to take the non-integer values $md < \mu < (m+1)d$, so that $\mu \neq md$.

Figure 1. A temperature-dependent small-world model with vertices arranged in a circle and chemical potential $d < \mu < 2d$ (where $d$ is the dimensionless distance between nearest neighbours along the circle). When $T = 0$ (left), the network is a ring with first-neighbour interactions. When $T = 1$ (center), the network is a 'small-world' with a few long-range connections and an incomplete circular 'backbone'. When $T = \infty$ (right), the network is a random graph with connection probability $p = 1/2$.
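The distance-based model on a circle can be sketched in a few lines of Python. The code below is an illustration under stated assumptions: distances are measured in units of the nearest-neighbour spacing (so $d = 1$) rather than rescaled by their average, and the function names `ring_distance` and `sample_ring_graph` are hypothetical.

```python
import math
import random


def ring_distance(i, j, n):
    """Shortest distance along a circle of n vertices, in units of the spacing d = 1."""
    gap = abs(i - j)
    return min(gap, n - gap)


def sample_ring_graph(n, mu, T, rng):
    """Sample the set of links (i, j), i < j, with p_ij = 1 / (exp((d_ij - mu)/T) + 1)."""
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = ring_distance(i, j, n)
            if T == 0:
                # Zero temperature: step function, with probability 1/2 exactly at d_ij = mu.
                p = 1.0 if d_ij < mu else (0.5 if d_ij == mu else 0.0)
            else:
                arg = (d_ij - mu) / T
                p = 0.0 if arg > 700.0 else 1.0 / (math.exp(arg) + 1.0)
            if rng.random() < p:
                edges.add((i, j))
    return edges


if __name__ == "__main__":
    rng = random.Random(1)
    for T in (0, 0.3, 1.0, 1e6):
        links = sample_ring_graph(20, mu=1.5, T=T, rng=rng)
        long_range = sum(1 for i, j in links if ring_distance(i, j, 20) > 1)
        print(T, len(links), long_range)
```

With $1 < \mu < 2$ the zero-temperature output is the first-neighbour ring; raising $T$ removes some backbone links and adds long-range ones, and at very large $T$ roughly half of all pairs are connected.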

If the temperature is slightly increased from $T = 0$ to a small positive value, then these regular ring-like or lattice structures will be perturbed, with a small number of short-range connections being replaced by longer-range ones (see fig. 1). At higher temperature, the zero-temperature [...] high-temperature/high-rewiring regimes.

In the limit of low rewiring probability, the standard WS model exhibits the so-called 'small-world' effect, i.e. the combination of a large value of the clustering coefficient (measuring the average fraction of realized triangles at each node) and of a small value of the average vertex-vertex distance (which increases only logarithmically with the size of the graph) [1]. Since these properties are found in the low-rewiring regime, we can expect that they would be generated also in the low-temperature regime of our model. Again, this means that the empirically observed properties (in this case the small-world effect) are reproduced for small positive values of the graph temperature.

6.2. Scale-free small-worlds

As for the random graph model, we know that the simple small-world model (either the original WS one or our temperature-dependent reformulation above) does not reproduce the broad degree distribution so widely observed in real networks. Here, we briefly discuss how our model above can be extended in order to account for a heterogeneous (and, if necessary, scale-free) degree distribution.

To this end, we combine two models considered so far, by assuming that the link energy in eq.(3) is determined not only by distances, as in the above model, but also by vertex-specific properties, as in eq.(16). This leads to

$$\epsilon_{ij} = d_{ij} + \epsilon_i + \epsilon_j \tag{39}$$

where now we imagine that $d_{ij}$ has been divided by its average over all pairs of vertices, while $\epsilon_i$ and $\epsilon_j$ have been divided by their average over all vertices. Correspondingly, eq.(3) becomes

$$E_A = \sum_{i<j} d_{ij}\, a_{ij} + \sum_i \epsilon_i k_i \tag{40}$$

and eq.(6) becomes

$$p_{ij}(T) = \frac{1}{e^{(d_{ij} + \epsilon_i + \epsilon_j - \mu)/T} + 1} \tag{41}$$

Clearly, a sufficiently heterogeneous distribution of the values of $\epsilon_i$ will induce a broad degree distribution, exactly as we showed in sec. 4. In particular, a suitable choice allows one to reproduce the scale-free and small-world properties simultaneously. However, a general conclusion one can learn from this model is that, if the distances arise from a homogeneous spatial distribution of vertices and if the degree distribution induced by eq.(40) is very broad, this typically means that, while the distribution of the sums $\epsilon_i + \epsilon_j$ is very broad, the distribution of the distances $d_{ij}$ is much more narrowly concentrated around its average value $\bar{d} = 1$. Looking at eq.(39), this means that the distribution of $\epsilon_{ij}$ is mainly determined by that of $\epsilon_i + \epsilon_j$, i.e. we can make the approximation

$$\epsilon_{ij} \approx \bar{d} + \epsilon_i + \epsilon_j = 1 + \epsilon_i + \epsilon_j \tag{42}$$

[...] zero temperature all such components are in any case complete cliques (see fig. 2a-b).

For small but positive $T$, the zero-temperature structure will be perturbed into a finite-temperature one where the original cliques become 'modules' of densely (but not completely) connected vertices, with a few links connecting different modules (see fig. 2c-d). This is precisely the kind of community structure that is observed in most real socio-economic networks [4]. When $T$ becomes large, more missing links will be produced within communities and more links will be produced among them, until the intra-community and inter-community densities equalize to the common value 1/2 in the limit $T \to +\infty$. So, again, we find that in order to reproduce the empirical properties of socio-economic networks (where the contrast between inter- and intra-community density is very marked, but at the same time [...]

It should be noted that in this case the range of variability of $d_{ij}$ can be much broader than in the non-ultrametric case, because here small intra-branch taxonomic distances coexist with large inter-branch ones. This means that now the approximation in eq.(42) is no longer legitimate, and we cannot reduce our model with community structure to the one without it. Rather, for all pairs of vertices $i$ and $j$ within the same community $C_\mu$ (as specified by $\mu$ when $T = 0$) we now have the inequality

$$d_{ij} < \mu$$

The above expression only holds within each community, while across communities the opposite inequality applies, confirming that now distances cannot be reabsorbed in a unique value of the chemical potential $\mu$. In other words, while our discussion in sec. 6 suggested that the scale-free property automatically ensures the small-world one, here we find that the scale-free property does not automatically ensure the presence of community structure (and vice versa). Of course, when the model considered here displays a sufficiently heterogeneous degree distribution it will also automatically imply the small-world property, along the lines discussed in sec. 6.

Figure 2. Our 'ultrametric small-world model' as a function of temperature $T$ and chemical potential $\mu$. Nodes (blue circles) are leaves of a dendrogram (black lines), separated by an ultrametric distance $d_{ij}$ (increasing along the purple axis) representing the height of the closest branching point separating vertices $i$ and $j$. The ultrametric distances determine the topology of the network (lying on the horizontal purple plane): a) when $T = 0$ and $\mu$ is small, the network is divided into many small cliques (blue links) corresponding to the disconnected branches obtained by 'cutting' the dendrogram along the orange dashed line determined by $\mu$; b) when $T = 0$ and $\mu$ is larger, the network is divided into fewer and larger cliques; c) when $T \gtrsim 0$ and $\mu$ is small, there are many small communities that are highly connected internally (blue links) and sparsely connected across (red links); d) when $T \gtrsim 0$ and $\mu$ is larger, there are fewer and larger communities, with a higher density contrast between intra-community (blue) and inter-community (red) links. After introducing an appropriate degree of heterogeneity at the level of vertices, this model can be turned into our 'ultrametric scale-free model' where a non-trivial community structure coexists with a broad degree distribution.
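The zero-temperature behaviour of the ultrametric model can be illustrated with a toy dendrogram. The sketch below assumes a balanced binary dendrogram whose leaves are labelled 0, 1, 2, ... in left-to-right order, so that the height of the closest common branching point of leaves $i$ and $j$ is the bit length of `i ^ j`. This construction, the unnormalized distances and the function names are illustrative assumptions, not the specification used in the paper.

```python
def ultrametric_distance(i, j):
    """Height of the lowest common ancestor of leaves i and j in a balanced binary tree."""
    return (i ^ j).bit_length()


def zero_temperature_links(n_leaves, mu):
    """Links realized at T = 0: all and only the pairs with d_ij < mu."""
    return {
        (i, j)
        for i in range(n_leaves)
        for j in range(i + 1, n_leaves)
        if ultrametric_distance(i, j) < mu
    }


def communities(n_leaves, links):
    """Connected components of the graph, found by merging vertex labels link by link."""
    label = list(range(n_leaves))
    for i, j in sorted(links):
        old, new = label[j], label[i]
        if old != new:
            label = [new if current == old else current for current in label]
    groups = {}
    for vertex, current in enumerate(label):
        groups.setdefault(current, []).append(vertex)
    return sorted(groups.values())


if __name__ == "__main__":
    for mu in (1.5, 2.5, 3.5):
        links = zero_temperature_links(8, mu)
        # Raising mu moves the cut up the dendrogram and merges cliques.
        print(mu, communities(8, links), len(links))
```

In this toy setting, raising $\mu$ moves the cut up the dendrogram, so the cliques merge into fewer and larger ones, which is the qualitative picture described for panels a) and b) of the figure.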
We can therefore conclude that the model defined by eqs.(39)-(41), where $d_{ij}$ is an ultrametric distance, is a simple but highly nontrivial one. Since throughout this paper we have not been interested in reproducing a particular network, but rather a class of generic empirical properties, we will not consider any specific $d_{ij}$ but merely note that, despite its simplicity, the above model is able to reproduce [...] networks, but to also enable a realistic simulation of distance-dependent dynamics of information [...]

Turning to eq.(6), if we require $p_{ij} = 0$ when $w_{ij} = 0$ and $p_{ij} = 1$ when $w_{ij} = +\infty$, we find that $w_{ij}$ must be proportional to the link fitness $e^{-\epsilon_{ij}/T}$. In other words, the weights must depend on $T$, which corresponds to the property that at low $T$ (more heterogeneous weights) some pairs of vertices are much more likely to be connected than other pairs, while at high $T$ (more homogeneous weights) all pairs of vertices tend to have a similar connection probability. Now, many real networks [29-33] display a power-law distribution of non-zero link weights of the form

$$P(w) \propto w^{-\alpha}$$

If we restrict ourselves to pairs of vertices with $w_{ij} > 0$ and define $x_{ij} \equiv w_{ij}/w_{\min}$ (where $w_{\min}$ is the minimum non-zero weight for a given network), corresponding to the preliminary rescaling $\epsilon_{ij} \to \epsilon_{ij} - \epsilon_{\max}$, then we can repeat the arguments leading to eq.(22). Specifically, we set $\epsilon_{ij} \equiv -T \ln x_{ij} \le 0$, where now $\rho(x)$ and $q(\phi)$ are distributions not over vertices, but over pairs of them (specifically, over the pairs with non-zero weights). This allows us to compute the temperature of real networks with power-law distributed weights as

$$T_{\text{weighted}} = \alpha - 1$$

The empirical values of $\alpha$ found in various weighted networks [29-33] are summarized in Table 1 [...]

We have therefore found that a general mapping from weights to probabilities is given by

$$p_{ij}(T) = \frac{z\, x_{ij}}{1 + z\, x_{ij}}$$

where $x_{ij} \equiv w_{ij}/w_{\min}$ and $z \equiv e^{\mu/T}$ is a free parameter. Note that the above expression works for both zero and non-zero weights. We also note that the classical limit (9) of this expression reads $p_{ij} \approx z\, x_{ij}$, and if we choose $z = w_{\min}/w_{\max}$ we have

$$p_{ij} \approx \frac{w_{ij}}{w_{\max}}$$

which is approximately equivalent to the choice explored by us in ref. [19].

[...] with weight such that $x_{ij}(T) > z^{-1}(T)$ in the limit $T \to 0$ are selected and the others are discarded.

Interestingly, since the ordering of the weights is preserved at all temperatures, this corresponds to a standard thresholding procedure, adopted for instance in ref. [34] to filter stock correlations and in ref. [35] to extract minimum spanning trees from real foodwebs. These filtering techniques discard most of the information contained in the weights, resulting in a single (threshold-dependent) binary graph. Here we find that this corresponds to the zero-temperature limit of our mapping from weighted networks to ensembles of binary graphs. Our results extend these techniques to the finite temperature case, making it possible to preserve the heterogeneity of the links and explore the whole ensemble of possible configurations with the appropriate probabilities. We expect that this will represent an improved filtering technique, with a significantly reduced information loss.
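The contrast between hard thresholding and the finite-temperature mapping from weights to probabilities can be sketched as follows. The dictionary representation of the weighted network, the function names and the way the threshold is expressed ($x_{ij} > 1/z$) are assumptions made here for illustration only.

```python
def weights_to_probabilities(weights, z):
    """Map link weights to connection probabilities p = z*x / (1 + z*x), x = w / w_min.

    weights is a dict {(i, j): w_ij} (hypothetical input format); w_min is the
    smallest non-zero weight, and zero weights map to zero probability.
    """
    w_min = min(w for w in weights.values() if w > 0)
    probabilities = {}
    for pair, w in weights.items():
        x = w / w_min
        probabilities[pair] = z * x / (1.0 + z * x)
    return probabilities


def threshold_filter(weights, z):
    """Hard thresholding: keep only the pairs whose rescaled weight exceeds 1/z."""
    w_min = min(w for w in weights.values() if w > 0)
    return {pair for pair, w in weights.items() if w / w_min > 1.0 / z}


if __name__ == "__main__":
    example = {(0, 1): 1.0, (0, 2): 4.0, (1, 2): 0.0, (1, 3): 10.0}
    print(weights_to_probabilities(example, z=0.25))
    print(threshold_filter(example, z=0.25))
```

The probabilistic mapping keeps the ordering of the weights while assigning every non-zero weight a non-zero chance of appearing, whereas the threshold keeps a single binary graph and discards the rest of the information.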

We have introduced the concept of 'graph temperature', which can vary from zero to infinity, in order to explore the behaviour of networks in the limit of large network size while keeping the local properties well-defined. Since our methodology makes use of statistical graph ensembles that extend the class of Exponential Random Graphs widely used in social network analysis, it has a natural application as a generalized model of social and economic networks. We showed that many structural properties that are ubiquitous in socio-economic networks can be simply understood as the effects of an optimized low-temperature behaviour resulting from 'connectivity costs', and confirmed this by measuring the temperature of both binary and weighted real-world scale-free networks. Furthermore, we have also shown that a variety of different models and techniques can in fact be regarded as particular cases of a more general temperature-dependent formalism. We believe that our results provide an intuitive and unified understanding of many properties of real socio-economic networks, from their scale-free and small-world behaviour to their hierarchical community structure.